Tuning the OCaml memory allocator for large data processing jobs
TL;DR: setting OCAMLRUNPARAM=s=4M,i=32M,o=150 can make your OCaml programs run faster. Read on for details and for how to check whether the garbage collector is thrashing and slowing down your program.
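When setting an environment variable is inconvenient, roughly the same settings can be applied from inside the program through the standard library's Gc module. The following is a minimal sketch, assuming the classic OCaml 4.x runtime, where s, i, and o correspond to the minor_heap_size, major_heap_increment, and space_overhead fields, sizes are counted in words, and the M suffix means 2^20. The helper name apply_tuned_gc_params is made up for illustration, and newer runtimes may ignore some of these fields.

```ocaml
(* Hypothetical helper: applies the TL;DR settings programmatically.
   Sizes are in words; the M suffix in OCAMLRUNPARAM means 2^20. *)
let mega = 1024 * 1024

let apply_tuned_gc_params () =
  Gc.set
    { (Gc.get ()) with
      Gc.minor_heap_size = 4 * mega;        (* s=4M *)
      Gc.major_heap_increment = 32 * mega;  (* i=32M *)
      Gc.space_overhead = 150;              (* o=150 *)
    }

let () =
  apply_tuned_gc_params ();
  (* Read the parameters back to confirm what the runtime accepted. *)
  let c = Gc.get () in
  Printf.printf "minor heap: %d words, space_overhead: %d\n"
    c.Gc.minor_heap_size c.Gc.space_overhead
```

The environment variable is usually the better choice for experiments, since it changes nothing in the code and can be varied per run; a call like the one above is mainly useful once a set of values has proven itself for a particular program.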
In my research work with GroupLens, I do most of my coding for data processing, algorithm implementation, etc. in OCaml. Sometimes I have to suffer a bit for this when some nice library doesn’t have OCaml bindings, but in general it works out fairly well. And every time I go to do some refactoring, I am reminded why I’m not coding in Python.
One thing I have found, however, is that the default OCaml garbage collector parameters are not very well suited to much of my work, which frequently consists of long-running data processing tasks that build and manipulate large, often persistent[1] data structures. The program will run somewhat slowly (although there usually isn’t anything to compare it against), but more importantly, profiling with gprof will reveal that my program is spending a substantial amount of its time (~30% or more) in the OCaml garbage collector (if memory serves, frequently in the function caml_gc_major_slice).
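A cheaper first check than a full gprof run is to ask the runtime how many collections it performed. This is a different technique from profiling: it counts collections rather than measuring time. The sketch below wraps a workload with Gc.quick_stat from the standard library; build_table and with_gc_report are made-up names, and the map-building loop is only a stand-in for a real data processing job. It assumes OCaml 4.08 or later for the Int module.

```ocaml
(* Hypothetical workload: builds a large persistent map, standing in
   for a real data processing job. *)
module IntMap = Map.Make (Int)

let build_table n =
  let rec go i acc =
    if i >= n then acc else go (i + 1) (IntMap.add i (float_of_int i) acc)
  in
  go 0 IntMap.empty

(* Run [f] and report how many collections happened while it ran. *)
let with_gc_report label f =
  let before = Gc.quick_stat () in
  let result = f () in
  let after = Gc.quick_stat () in
  Printf.printf "%s: %d minor, %d major collections\n" label
    (after.Gc.minor_collections - before.Gc.minor_collections)
    (after.Gc.major_collections - before.Gc.major_collections);
  result

let () =
  let table = with_gc_report "build_table" (fun () -> build_table 1_000_000) in
  Printf.printf "entries: %d\n" (IntMap.cardinal table)
```

A high count of major collections for the amount of work done is a hint, not proof, that the collector is busy; comparing the counts with and without tuned parameters gives a rough sense of whether the settings are having an effect. Adding v=0x400 to OCAMLRUNPARAM should also print overall GC statistics at program exit without any code changes.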