This post briefly discusses the nix-shell environment for reproducible programming. In particular, the emphasis is on installing and working with R packages that are not on CRAN, i.e. packages hosted on GitHub which are normally installed with devtools.

Background

The entire nix ecosystem is fantastic, and is the main packaging system used by d-SEAMS as well. Recently I began working through the excellent second edition of “Statistical Rethinking” by Richard McElreath [1].

Unfortunately, the rethinking package, which is a major component of the book, depends on the V8 engine for some reason. The reigning AUR [2] package (V8-r) broke with a fun error message I couldn’t be bothered to deal with. Ominously, the rest of the logs prominently featured “Warning: Running gclient on Python 3.”. Given that older Python versions have been permanently retired, this seemed like a bad thing to deal with [3]. In any case, having weaned myself off non-nix dependency tools for Python and friends, it seemed strange not to do the same for R.

The standard installation for the package entails obtaining rstan (which is trivial with nixpkgs) and then using:

install.packages(c("coda","mvtnorm","devtools","loo","dagitty"))
library(devtools)
devtools::install_github("rmcelreath/rethinking")

We will break this down and work through this installation in Nix space.

Nix and R

The standard approach to setting up a project shell.nix is simply to use the mkShell function. There are some common aspects to this workflow, with more language specific details documented here. A simple first version might be:

let
  pkgs = import <nixpkgs> { };
in pkgs.mkShell {
  buildInputs = with pkgs; [
    zsh
    R
    rPackages.ggplot2
    rPackages.data_table
  ];
  shellHook = ''
    echo "hello"
  '';
  # There is no top-level `with pkgs;` here, so every name is qualified.
  # lib.optionalString replaces stdenv.lib, which newer nixpkgs deprecates.
  LOCALE_ARCHIVE = pkgs.lib.optionalString pkgs.stdenv.isLinux
    "${pkgs.glibcLocales}/lib/locale/locale-archive";
}

Note that CRAN packages can be installed as easily as regular packages (like R), except that they live in the pkgs.rPackages attribute set rather than directly in pkgs. This is a common convention for most languages with central repositories. The most interesting thing to note is that, as with nix-python setups, a dot in a package name is converted to an underscore, i.e. data.table -> data_table.
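The renaming rule can be sketched as a small shell helper. This is only a minimal sketch of the dot-to-underscore rule mentioned above; the function name cran_to_nix_attr is hypothetical, and any other renaming nixpkgs might apply is not handled.

```shell
# Hypothetical helper: map a CRAN package name to the attribute name
# expected under rPackages, using only the dot -> underscore rule.
cran_to_nix_attr() {
  printf '%s\n' "$1" | tr '.' '_'
}

# Usage: build the attribute path to list in buildInputs.
#   echo "rPackages.$(cran_to_nix_attr data.table)"   # prints rPackages.data_table
```

Names without a dot, such as ggplot2, pass through unchanged.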

However, for the rethinking package, and many others, there is no current CRAN package, and so the rPackages approach fails.

The LOCALE_ARCHIVE needs to be set for Linux machines, and is required for working with other packages.

Nix-R and Devtools

To work with non-CRAN packages, we need to modify our package setup a little. We will also simplify our file to split the pkgs and the r-pkgs.

Naive Approach

The naive approach works by using the shellHook to set R_LIBS_USER to save user packages per-project.

{ pkgs ? import <nixpkgs> { } }:
with pkgs;
let
  # R wrapped together with the CRAN packages available from nixpkgs
  my-r-pkgs = rWrapper.override {
    packages = with rPackages; [
      ggplot2
      knitr
      rstan
      tidyverse
      V8
      dagitty
      coda
      mvtnorm
      shape
      Rcpp
      tidybayes
    ];
  };
in mkShell {
  buildInputs = with pkgs; [ git glibcLocales openssl openssh curl wget ];
  inputsFrom = [ my-r-pkgs ];
  # Keep packages installed from within R in a per-project directory
  shellHook = ''
    mkdir -p "$(pwd)/_libs"
    export R_LIBS_USER="$(pwd)/_libs"
  '';
  GIT_SSL_CAINFO = "${cacert}/etc/ssl/certs/ca-bundle.crt";
  # lib.optionalString replaces stdenv.lib, which newer nixpkgs deprecates
  LOCALE_ARCHIVE = lib.optionalString stdenv.isLinux
    "${glibcLocales}/lib/locale/locale-archive";
}

Note that here we will also need to set the GIT_SSL_CAINFO to prevent some errors during the build process [4].

Native Approach

The native approach essentially leverages the nix method for building R packages. This is the most reproducible of the lot, and also has the useful property of storing the files in the nix-store, so re-using a package across different projects will not store, build or download it again. The values required (rev and sha256) can be obtained from nix-prefetch-git as follows:

nix-env -i nix-prefetch-git
nix-prefetch-git https://github.com/rmcelreath/rethinking.git
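
nix-prefetch-git prints its result as JSON, and the rev and sha256 fields are the ones that go into fetchFromGitHub. As a rough sketch, assuming the output keeps one "key": "value" pair per line and using a hypothetical helper named json_field, the two values can be pulled out with sed:

```shell
# Hypothetical helper: read nix-prefetch-git's JSON on stdin and print the
# string value of one top-level field.
# Assumes one "key": "value" pair per line, as in the pretty-printed output.
json_field() {
  sed -n "s/^[[:space:]]*\"$1\":[[:space:]]*\"\([^\"]*\)\".*/\1/p"
}

# Usage (needs network access and nix-prefetch-git on the PATH):
#   out="$(nix-prefetch-git https://github.com/rmcelreath/rethinking.git)"
#   printf '%s\n' "$out" | json_field rev
#   printf '%s\n' "$out" | json_field sha256
```

Without a --rev argument the tip of the default branch is fetched, so the values will generally differ from the pinned ones used in this post.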

The crux of this approach is the following snippet [5]:

(buildRPackage {
  name = "rethinking";
  src = fetchFromGitHub {
    owner = "rmcelreath";
    repo = "rethinking";
    rev = "d0978c7f8b6329b94efa2014658d750ae12b1fa2";
    sha256 = "1qip6x3f6j9lmcmck6sjrj50a5azqfl6rfhp4fdj7ddabpb8n0z0";
  };
  propagatedBuildInputs = [ coda MASS mvtnorm loo shape rstan dagitty ];
})

Project Shell

This formulation for some strange reason does not work from the shell or environment by default, but does work with nix-shell --run bash --pure.

{ pkgs ? import <nixpkgs> { } }:
with pkgs;
let
  my-r-pkgs = rWrapper.override {
    packages = with rPackages; [
      ggplot2
      knitr
      rstan
      tidyverse
      V8
      dagitty
      coda
      mvtnorm
      shape
      Rcpp
      tidybayes
      # rethinking is not on CRAN, so it is built from a pinned GitHub revision
      (buildRPackage {
        name = "rethinking";
        src = fetchFromGitHub {
          owner = "rmcelreath";
          repo = "rethinking";
          rev = "d0978c7f8b6329b94efa2014658d750ae12b1fa2";
          sha256 = "1qip6x3f6j9lmcmck6sjrj50a5azqfl6rfhp4fdj7ddabpb8n0z0";
        };
        propagatedBuildInputs = [ coda MASS mvtnorm loo shape rstan dagitty ];
      })
    ];
  };
in mkShell {
  buildInputs = with pkgs; [ git glibcLocales openssl which openssh curl wget my-r-pkgs ];
  shellHook = ''
    mkdir -p "$(pwd)/_libs"
    export R_LIBS_USER="$(pwd)/_libs"
    # Print the path of the wrapped R provided by this shell
    echo ${my-r-pkgs}/bin/R
  '';
  GIT_SSL_CAINFO = "${cacert}/etc/ssl/certs/ca-bundle.crt";
  # lib.optionalString replaces stdenv.lib, which newer nixpkgs deprecates
  LOCALE_ARCHIVE = lib.optionalString stdenv.isLinux
    "${glibcLocales}/lib/locale/locale-archive";
}

The reason behind this is simply that rWrapper forms an extra package which has lower precedence than the user profile R, which is documented in more detail here on the NixOS wiki.
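
A quick way to see which R wins on the PATH is to compare it with the wrapper path echoed by the shellHook. The following is a small sketch; the helper name r_on_path_is and the WRAPPER_R variable are hypothetical, and the check is a plain string comparison of paths.

```shell
# Hypothetical helper: succeed only if the R found first on PATH is the
# binary at the given path (for example the path echoed by the shellHook).
r_on_path_is() {
  [ "$(command -v R)" = "$1" ]
}

# Usage inside the project shell, with WRAPPER_R set to the echoed path:
#   r_on_path_is "$WRAPPER_R" || echo "a different R takes precedence"
```

If the check fails in a plain nix-shell but passes under nix-shell --run bash --pure, the user profile R is the one shadowing the wrapper.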

User Profile

This is a more general approach which defines the environment for R with all the relevant libraries and is described in the nixpkgs manual. The following code should be placed in $HOME/.config/nixpkgs/config.nix:

{
  packageOverrides = super:
    let self = super.pkgs;
    in {
      # R environment installable into the user profile as `rEnv`
      rEnv = super.rWrapper.override {
        packages = with self.rPackages; [
          ggplot2
          knitr
          tidyverse
          tidybayes
          (buildRPackage {
            name = "rethinking";
            src = self.fetchFromGitHub {
              owner = "rmcelreath";
              repo = "rethinking";
              rev = "d0978c7f8b6329b94efa2014658d750ae12b1fa2";
              sha256 = "1qip6x3f6j9lmcmck6sjrj50a5azqfl6rfhp4fdj7ddabpb8n0z0";
            };
            propagatedBuildInputs =
              [ coda MASS mvtnorm loo shape rstan dagitty ];
          })
        ];
      };
    };
}

This snippet allows us to use our R as follows:

# Install things
nix-env -f "<nixpkgs>" -iA rEnv
# Fix locale
export LOCALE_ARCHIVE="$(nix-build --no-out-link "<nixpkgs>" -A glibcLocales)/lib/locale/locale-archive"
# Profit
R

Note that in this method, on Linux systems, the locale problem has to be fixed with the explicit export. This means that this should be used mostly with project level environments, instead of populating the global shell RC files.

Update: There is another post with methods to reload this configuration automatically.

Conclusions

Of the methods described, the most useful method for working with packages not hosted on CRAN is through the user-profile, while the shell.nix method is useful in conjunction, for managing various projects. So the ideal approach is then to use the user profile for installing anything which normally uses devtools and then use shell.nix for the rest.

Note that if the Project Shell is used with a User Profile as described in the previous section, all packages already defined in the user profile can be dropped from the shell.nix, and the project shell then does not need to execute R by default. The simplified shell.nix is then simply:

{ pkgs ? import <nixpkgs> { } }:
with pkgs;
let
  # Only project-specific packages remain; the rest come from the user profile
  my-r-pkgs = rWrapper.override {
    packages = with rPackages; [
      ggplot2
    ];
  };
in mkShell {
  buildInputs = with pkgs; [ git glibcLocales openssl which openssh curl wget my-r-pkgs ];
  inputsFrom = [ my-r-pkgs ];
  shellHook = ''
    mkdir -p "$(pwd)/_libs"
    export R_LIBS_USER="$(pwd)/_libs"
  '';
  GIT_SSL_CAINFO = "${cacert}/etc/ssl/certs/ca-bundle.crt";
  # lib.optionalString replaces stdenv.lib, which newer nixpkgs deprecates
  LOCALE_ARCHIVE = lib.optionalString stdenv.isLinux
    "${glibcLocales}/lib/locale/locale-archive";
}

The entire workflow for rethinking is continued here.


  1. As part of a summer course at the University of Iceland relating to their successful COVID-19 model.

  2. The Arch User Repository is the port of first call for most ArchLinux users.

  3. Though, like any good AUR user, I did post a bug report.

  4. This approach is also discussed here.

  5. As discussed on this issue, this Stack Overflow question and also seen here.