One thing I have learnt to value is configuration management. It’s an under-rated aspect of reproducible research. I was quickly experimenting with an R package, but needed to install the sf package on my tinkering machine. That in turn required another R package, units. And in turn, units needed some system header files, as did sf. As it turns out, the units package failed with a very nice error message telling me exactly what I had to do. And I had a strong sense of deja vu dealing with sf, because I know I’ve handled that before.
sudo apt install libgdal-dev libudunits2-dev
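Since I clearly forget this between machines, it may help to write the check down. The following is only a sketch for a Debian or Ubuntu machine: the helper name missing_packages is made up for illustration, and the package list is just the two libraries from the command above.

```shell
#!/bin/sh
# Hypothetical helper: print each named Debian/Ubuntu package that is not installed.
missing_packages() {
    for pkg in "$@"; do
        # dpkg-query reports "install ok installed" only for fully installed packages;
        # anything else (unknown package, removed package, no dpkg at all) counts as missing.
        if ! dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null | grep -q "install ok installed"; then
            printf '%s\n' "$pkg"
        fi
    done
}

# System libraries needed by the sf and units R packages (taken from the apt command above).
missing=$(missing_packages libgdal-dev libudunits2-dev)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing system packages:" $missing
    echo "Install with: sudo apt install" $missing
else
    echo "All system packages present"
fi
```

Running this before installing sf in R at least turns a compile failure halfway through into a one-line reminder.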
One moral of the story is to remember not to try to do things quickly like this. The interesting thing is that this used to be the only way I worked. Cue lots of frustration when one package got updated and broke some other code.
It turns out that there is now a CRAN task view dedicated to reproducible research, which has a section on package reproducibility. This includes links to packages such as R bundler, which attempts to tame your package requirements on a project-by-project basis.
However, since having machines with enough RAM, I’ve found it very nice to use virtual machines during development. I used to use VirtualBox a long time ago (it let me run Linux on a MacBook). But you can also drive it with tools such as Vagrant to create a virtual machine. And Vagrant can in turn call an Ansible playbook to provision this virtual machine. Voila, tinker away, break everything, start again.
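The "break everything, start again" loop can be sketched as a small shell helper. This assumes a Vagrantfile already exists in the current directory and is set up to call the Ansible provisioner; the function name rebuild_vm is hypothetical, and only the standard vagrant subcommands are used.

```shell
#!/bin/sh
# Hypothetical helper: throw away the current VM and rebuild it from the Vagrantfile.
rebuild_vm() {
    if ! command -v vagrant >/dev/null 2>&1; then
        echo "vagrant not found on PATH" >&2
        return 1
    fi
    vagrant destroy -f   # remove the existing VM without asking for confirmation
    vagrant up           # create the VM again; provisioners run when it is first created
}

# Usage, from the directory containing the Vagrantfile:
#   rebuild_vm
#   vagrant provision   # re-run the provisioner on an already running VM
#   vagrant ssh         # log in and tinker
```

Nothing here is specific to R; the point is only that starting again becomes a single command rather than an afternoon.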
install.packages("XML", repos = "http://www.omegahat.net/R")