Insights for Action

Configuration management

One thing I have learnt is the value of configuration management. It’s an underrated aspect of reproducible research. I was quickly experimenting with an R package, but needed to install the sf package on my tinkering machine. That in turn required another R package, units. And in turn, units needed some header files, as did sf. As it turns out, the units package failed with a very nice error message telling me exactly what I had to do. And I had serious déjà vu dealing with sf, because I know I’ve handled that before.

sudo apt install libgdal-dev libudunits2-dev
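
Before starting a long source build, it can help to see which of these system libraries are still missing. The following is a minimal sketch, assuming a Debian or Ubuntu machine where dpkg is available; missing_packages is a hypothetical helper name introduced here, not part of any existing tool.

```sh
#!/bin/sh
# Hypothetical helper: print each named package that `dpkg -s` does not find.
# Assumption: a Debian/Ubuntu system. If dpkg itself is unavailable, every
# name is printed, because the lookup fails for all of them.
missing_packages() {
    for pkg in "$@"; do
        if ! dpkg -s "$pkg" >/dev/null 2>&1; then
            printf '%s\n' "$pkg"
        fi
    done
}
```

Calling missing_packages libgdal-dev libudunits2-dev then prints only the names that still need installing, so nothing is listed once both sets of header files are in place.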

One moral of the story is to remember not to try to do things quickly like this. The interesting thing is that this used to be the only way I worked. Cue lots of frustration when one package got updated and broke some other code.

It turns out that there is now a CRAN task view dedicated to reproducible research, which has a section on package reproducibility. This includes links to packages such as R bundler, which attempts to tame your package requirements on a project-by-project basis.

However, since having machines with enough RAM, I’ve found it very nice to use virtual machines during development. I used to use VirtualBox a long time ago (it let me run Linux on a MacBook). But you can pair it with tools such as Vagrant to create a virtual machine, and Vagrant can in turn call an Ansible playbook to provision it. Voilà: tinker away, break everything, start again.
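
The break-everything-and-start-again loop can be sketched with two Vagrant commands. This is only an illustration, assuming a Vagrantfile already exists in the current directory and names whatever provisioner you use; the VAGRANT variable and the rebuild_vm function are hypothetical names introduced here so the steps can be dry-run.

```sh
#!/bin/sh
# Command used to drive the VM. Assumption: vagrant is on the PATH.
# Override it (for example VAGRANT=echo) to print the steps without running them.
VAGRANT="${VAGRANT:-vagrant}"

# Hypothetical helper: throw away the current VM, then recreate it and
# rerun the provisioner configured in the Vagrantfile.
rebuild_vm() {
    "$VAGRANT" destroy -f &&
    "$VAGRANT" up --provision
}
```

Running rebuild_vm after a failed experiment returns the machine to whatever state the provisioning describes, which is the point of keeping that configuration in a file rather than in your head.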

install.packages("XML", repos = "http://www.omegahat.net/R")