Blog Articles 176–180

Sales Tax and E-Commerce — Not a Simple Problem

In the age of e-commerce, sales taxes are a difficult problem. Currently, online retailers such as Amazon.com are under the same rule that has traditionally applied to mail-order sellers: they only have to collect sales tax from customers located in their own state (or in states in which they have a physical presence). Recently, this has gained greater attention because several states have passed measures to count affiliate program members (who receive kickbacks for links on blogs, etc.) as a physical presence, so Amazon.com would be required to collect sales tax from customers in any state in which one of its affiliates resides. These affiliates are often private individuals and do no direct sales for Amazon, only referrals, but in an effort to regain their tax base, states want to treat them as a physical presence.

I do not question that e-commerce is presenting significant problems for local economies. Money spent online does not stay in the local economy, and moving sales out-of-state does decrease the tax base for state income taxes. While most states require residents to pay sales tax themselves on out-of-state purchases, it is likely that few actually do so. Particularly in the present time of tight state budgets, this certainly isn’t helping matters.

It is tempting to just say, as many states are attempting to, that Amazon.com should collect sales tax on sales to their residents. Citizens for Tax Justice recently published an article accusing Amazon.com of fostering tax evasion and calling the Supreme Court ruling instituting the current system “misguided”.

I think this is an overly simplistic analysis of the situation. There is a complex tangle of issues surrounding sales tax in the U.S., and placing local requirements on Internet-based sellers creates further problems that I think are likely to be worse than the current woes.

Getting Things Typed: External Trusted Systems for Programming

One of the major tenets of David Allen’s Getting Things Done methodology is the concept of an external trusted system: a system for storing information outside your brain so that it can be retrieved as needed and/or brought to your attention when appropriate. Our brains are often fickle, and we are apt to forget things. Further, by trying to remember them, we spend mental energy trying not to forget them so that, even if we do remember, our productivity is decreased by the stress of trying not to forget. Getting notes, appointments, tasks, and pretty much anything else we need to remember out of our heads and into a reliable external storage and retrieval system enables us to free up our minds to focus on what we really want to accomplish.

I’ve been realizing lately that robust static type and module systems fill a similar role when programming. I have better things to do with my brain cycles than remember the details of functions, what they require, and where they are used.

A module and interface system like OCaml’s makes it easy to refer to the function header (its summary) when I need to recall its usage. Documentation extractors do provide some of this benefit, and languages like Java provide similar benefits with their amenability to static analysis and good support enabling auto-completion and other IDE lookup features. In a statically typed language, however, the type system explicitly delineates the permissible inputs and possible outputs for a function without requiring the programmer to list them manually. Therefore, the documentation just needs to describe behavior and any special requirements beyond those expressible in the type system (and the more expressive the type system, the fewer these requirements are likely to be). As a result, the information necessary to call a function is retrievable when needed.
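To make the "retrievable summary" idea concrete, here is a minimal sketch of an OCaml signature and a module that implements it. The names (WORD_COUNT, Word_count, add, count) are hypothetical and only for illustration; the point is that the signature alone tells me what each function needs and returns.

```ocaml
(* Hypothetical interface: the signature is the summary I would look up. *)
module type WORD_COUNT = sig
  type t
  val empty : t
  val add : string -> t -> t      (* record one occurrence of a word *)
  val count : string -> t -> int  (* occurrences seen so far, 0 if none *)
end

(* One possible implementation, hidden behind the signature. *)
module Word_count : WORD_COUNT = struct
  module M = Map.Make (String)
  type t = int M.t
  let empty = M.empty
  let count word wc =
    match M.find_opt word wc with
    | Some n -> n
    | None -> 0
  let add word wc = M.add word (count word wc + 1) wc
end

let () =
  let wc = Word_count.(empty |> add "tax" |> add "tax" |> add "van") in
  Printf.printf "tax: %d, van: %d\n"
    (Word_count.count "tax" wc) (Word_count.count "van" wc)
```

Because the representation type is abstract, callers can only go through the functions listed in the signature, so the signature really is the whole story of how the module may be used.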

The type system also enables the remind-when-appropriate aspect of an external trusted system. If I get something wrong when calling a function, there’s a decent chance the compiler will remind me when I compile the code. If I change a function, I don’t have to worry about remembering everywhere it was used; the type system will catch a large set of errors on the next compile cycle.
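A small sketch of the reminder effect, using a hypothetical shape type that is not from any real project. The comments describe what the compiler does when the type changes; the code as written compiles cleanly.

```ocaml
(* Hypothetical example type. *)
type shape =
  | Circle of float          (* radius *)
  | Rect of float * float    (* width, height *)

let area = function
  | Circle r -> Float.pi *. r *. r
  | Rect (w, h) -> w *. h

(* If a constructor such as [Triangle of float * float] were added to
   [shape], the compiler would flag [area] (and every other match on
   [shape] that lacks the new case) with a non-exhaustive match warning,
   so I would not have to remember where the type is used. Likewise,
   changing [area] to take an extra argument would turn every existing
   call site into a type error. *)

let () =
  Printf.printf "rect area: %.1f\n" (area (Rect (2.0, 3.0)))
```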

Fixing the Dash Lights on a Dodge Caravan

We had a problem this last week with our ’03 Dodge Grand Caravan — the dash lights went out. Completely. Instrument panel, radio, heater controls — all unlit. My first thought, naturally, was a fuse.

However, when I looked at the fuse box, I couldn’t find any fuse that looked like it controlled the instrument panel backlighting. Web searching turned up a few things, including a fixya entry and a DodgeTalk.com forum post which document the same problem and an odd fix: disconnect the battery or otherwise cut power to the computer.

So, I went out and pulled the IOD fuse (Ignition Off Draw, controls the power drawn when the vehicle is off) for a couple hours. Disconnecting the negative cable on the battery would accomplish the same thing for this purpose. After putting the fuse back in, the dash lights worked.

Piecing things together, particularly with the insights from the DodgeTalk post, it seems that the issue is a computer problem — sometimes, for some reason, the computer will stop turning on the dash lights. Disconnecting power to it for a while resets the computer, allowing the dash lights to start working again. Weirdest van repair ever, but it works, and here it is documented so others can hopefully find that the solution does, indeed, work.

Tuning the OCaml memory allocator for large data processing jobs

TL;DR: setting OCAMLRUNPARAM=s=4M,i=32M,o=150 can make your OCaml programs run faster. Read on for details and how to see if the garbage collector is thrashing and thereby slowing down your program.
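The same settings can also be applied from inside a program through the standard library's Gc module. This is only a sketch of the equivalent calls, assuming that the M suffix in OCAMLRUNPARAM means 2^20 and that sizes are counted in words; newer runtimes may ignore some of these fields (major_heap_increment in particular).

```ocaml
(* Roughly equivalent to OCAMLRUNPARAM=s=4M,i=32M,o=150.
   Sizes are in words, not bytes. *)
let tune_gc () =
  Gc.set { (Gc.get ()) with
           Gc.minor_heap_size = 4 * 1024 * 1024;        (* s=4M  *)
           Gc.major_heap_increment = 32 * 1024 * 1024;  (* i=32M *)
           Gc.space_overhead = 150 }                    (* o=150 *)

let () =
  tune_gc ();
  let c = Gc.get () in
  Printf.printf "minor heap: %d words, space overhead: %d\n"
    c.Gc.minor_heap_size c.Gc.space_overhead
```

Setting the environment variable is still the easier way to experiment, since it requires no recompilation.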

In my research work with GroupLens, I do most of my coding for data processing, algorithm implementation, etc. in OCaml. Sometimes I have to suffer a bit for this when some nice library doesn’t have OCaml bindings, but in general it works out fairly well. And every time I go to do some refactoring, I am reminded why I’m not coding in Python.

One thing I have found, however, is that the default OCaml garbage collector parameters are not very well-suited for much of my work, which frequently consists of long-running data processing tasks building and manipulating large, often persistent data structures. The program will run somewhat slowly (although there usually isn’t anything to compare it against), but more importantly, profiling with gprof will reveal that my program is spending a substantial amount of its time (~30% or more) in the OCaml garbage collector (if memory serves, frequently in the function caml_gc_major_slice).
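Besides gprof, a rough first check is to count how many collections a piece of work triggers, using the counters in the standard Gc module. The helper gc_delta below is hypothetical and only a sketch; it does not measure time spent in the collector, only how often it ran.

```ocaml
(* Hypothetical helper: run [f] and report how many minor and major
   collections happened while it ran. *)
let gc_delta f =
  let before = Gc.quick_stat () in
  let result = f () in
  let after = Gc.quick_stat () in
  let minor = after.Gc.minor_collections - before.Gc.minor_collections in
  let major = after.Gc.major_collections - before.Gc.major_collections in
  (result, minor, major)

let () =
  (* Stand-in workload: build a large list of boxed pairs. *)
  let build () = List.init 1_000_000 (fun i -> (i, float_of_int i)) in
  let (xs, minor, major) = gc_delta build in
  Printf.printf "%d elements: %d minor, %d major collections\n"
    (List.length xs) minor major
```

Running the same workload under different OCAMLRUNPARAM values and comparing the counts gives a quick indication of whether a setting reduces collector activity.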

My first OCaml syntax extension

Preface: In this post, I describe my adventures figuring out how to write a syntax extension for the OCaml programming language and attempt to provide something of a tutorial on writing a basic extension. I assume that you’re somewhat familiar with basic parsing technology and context-free grammars — if not, a good tutorial on parser construction with a tool like Yacc would be worth a read first.

One of the oft-touted benefits of OCaml is Camlp4, a pre-processor that facilitates extending the OCaml syntax to provide natural support for various constructions. This has been used for a variety of purposes, such as database type-checking, monad sugaring, and logging. In the hands of a capable author, a variety of wonders can be introduced to the OCaml language.

I’ve used syntax extensions for some time now, particularly PGOCaml and pa_lwt, to make life with OCaml much easier. I’d never written one, however, and found the documentation and other relevant material rather intimidating. Camlp4 documentation is somewhat hard to find, particularly for the current version (with OCaml 3.10, they made significant backwards-incompatible changes to Camlp4, so much of the available tutorial and reference material is somewhat obsolete). I found the documentation that was around difficult to start with, particularly since I want to understand what the code I write does and not just cargo-cult it.

But I finally bit the bullet and learned. And when all was said and done, I had 13 lines of code which provide a small sugar, sort of a minimal syntax extension. This extension provides pattern matching over lazy lists, much like llists but far simpler (and based on the Batteries lazy list module). Here it is, in its entirety, and then I’ll explain how it works and what’s needed to get started with the bare basics of extending OCaml syntax: