Monkeying around with nix for HPC systems which have no root access and NFS filesystems.

Background

Nix is not well known for being friendly to users without root access. This is typically made worse by the “exotic” filesystem attributes common to HPC networks (this also plagues hermes). An earlier post details how and why proot failed. The short pitch is simply:

Figure 1: Does your HPC look like this?

Figure 2: It really is an HPC

If your HPC doesn’t look that swanky and you’d like it to, then read on. Note that there are all the obvious benefits of nix as well, but this is a more eye-catchy pitch.

Setup

The basic concept is to install nix from source, with appropriate patches, and then mess around with paths until it is ready and willing to work with a store that is not /nix [1].

This concept is strongly influenced by the work described in this repo. The premise is similar to my earlier post on HPC Dotfiles. For the purposes of this post, we will assume that all the packages in the previous post exist. lmod is not required; feel free to use an alternative path management system, or even just $HOME/.local, but if lmod is present, it is highly recommended [2]. We will need the following:

Pinned set of nixpkgs
We would like to be able to modify a lot of paths, which is normally a bad practice, but then we don’t normally rebuild all packages either. Grab a copy of the nixpkgs by following the instructions below. Now is also the time to fork the repo if you’d like to keep track of your changes.
mkdir -p $HOME/Git/Github
cd $HOME/Git/Github
git clone https://github.com/NixOS/nixpkgs
dotgit
We use the older, bash version of the excellent dotgit since python is not always present in HPC environments.
git clone https://github.com/kobus-v-schoor/dotgit/
mkdir -p $HOME/.local/bin
cp -r dotgit/old/bin/bash_completion dotgit/old/bin/dotgit dotgit/old/bin/dotgit_headers dotgit/old/bin/fish_completion.fish $HOME/.local/bin/
lmod packages
If you do not or cannot use modulefiles as described in the earlier post, inspect the module-files being loaded and set paths accordingly.
cd $HOME/Git/Github
git clone https://github.com/HaoZeke/hzHPC_lmod
cd hzHPC_lmod
$HOME/.local/bin/dotgit restore hzhpc

Now we can start by obtaining the nix sources.

myprefix=$HOME/.hpc/nix/nix-boot
nixdir=$HOME/.nix
nix_version=2.3.7
ml load gcc/9.2.0 flex bison
ml load boost
ml load editline
ml load brotli/1.0.1
ml load libseccomp/2.4.4
ml load bdwgc/8.0.4
ml load bzip2/1.0.8
ml load sqlite
ml load patch xz
wget http://nixos.org/releases/nix/nix-${nix_version}/nix-${nix_version}.tar.bz2
# Reuse the version variable so the unpack step follows the download
tar xfv nix-${nix_version}.tar.bz2
cd nix-${nix_version}
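
Module names differ between clusters, so it can be worth confirming that the build tools really landed on PATH before configuring. The following is only a minimal sketch: missing_cmds is a hypothetical helper, not something provided by lmod or nix, and the command list in the example is an assumption about what the modules above expose.

```shell
# Hypothetical helper: report which of the named commands are missing from PATH.
# Prints one missing command per line; exits non-zero if anything is missing.
missing_cmds() (
    status=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
            status=1
        fi
    done
    exit "$status"
)

# Example (run after the `ml load` lines):
#   missing_cmds gcc flex bison patch xz wget tar || echo "some tools are missing"
```

The function body runs in a subshell, so the loop variables do not leak into the interactive session.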

Before actually configuring and installing from source, we need some patches.

Patches

I suggest carefully typing out the patches, though leave a comment if you want a repo with these changes (if you must star something in the meantime, star this).

wget https://github.com/NixOS/nix/commit/8d3cb66d22f348341d7afa626acfa53b40584fdd.patch
git apply 8d3cb66d22f348341d7afa626acfa53b40584fdd.patch

Remove the following ifdef stuff from src/libutil/compression.cc, leaving only the contents of the else statement.

#ifdef HAVE_LZMA_MT
            lzma_mt mt_options = {};
            mt_options.flags = 0;
            mt_options.timeout = 300; // Using the same setting as the xz cmd line
            mt_options.preset = LZMA_PRESET_DEFAULT;
            mt_options.filters = NULL;
            mt_options.check = LZMA_CHECK_CRC64;
            mt_options.threads = lzma_cputhreads();
            mt_options.block_size = 0;
            if (mt_options.threads == 0)
                mt_options.threads = 1;
            // FIXME: maybe use lzma_stream_encoder_mt_memusage() to control the
            // number of threads.
            ret = lzma_stream_encoder_mt(&strm, &mt_options);
            done = true;
#else
            printMsg(lvlError, "warning: parallel XZ compression requested but not supported, falling back to single-threaded compression");
#endif
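
If hand-editing feels error prone, the same removal can be sketched with awk. This is an illustration rather than the method used for this post: keep_else_branch is a hypothetical helper, and it assumes the HAVE_LZMA_MT guard is not nested inside another conditional.

```shell
# Hypothetical helper: print a file with the HAVE_LZMA_MT branch removed,
# keeping only the lines between the matching #else and #endif.
keep_else_branch() {
    awk '
        /^[ \t]*#ifdef HAVE_LZMA_MT/ { state = 1; next }  # guarded branch starts: drop it
        state == 1 && /^[ \t]*#else/ { state = 2; next }  # fallback branch starts: keep it
        state > 0 && /^[ \t]*#endif/ { state = 0; next }  # conditional ends
        state != 1 { print }                              # outside the guard, or in #else
    ' "$1"
}

# Example: keep_else_branch src/libutil/compression.cc > compression.cc.new
```

Writing to a new file and diffing it against the original before replacing anything keeps the change easy to review.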

If there is trouble with the bzip2 library, point the bzlib.h include in src/libutil/compression.cc at $HOME/.hpc/bzip2/1.0.8/include/bzlib.h, with $HOME written out as the full absolute path (the compiler will not expand it).
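
A minimal sketch of that include rewrite with sed follows. It assumes the source contains the literal line `#include <bzlib.h>`, which should be checked first; absolute_bzlib_include is a hypothetical helper name.

```shell
# Hypothetical helper: print the source with <bzlib.h> replaced by an absolute path.
# $1 = source file, $2 = absolute path to bzlib.h (must not contain "|" or "&").
absolute_bzlib_include() {
    sed "s|#include <bzlib.h>|#include \"$2\"|" "$1"
}

# Example, letting the shell expand $HOME before sed sees it:
#   absolute_bzlib_include src/libutil/compression.cc \
#       "$HOME/.hpc/bzip2/1.0.8/include/bzlib.h" > compression.cc.new
```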

Finally, you will need to edit nixpkgs.

# vim pkgs/os-specific/linux/busybox/default.nix
  debianName = "busybox_1.30.1-6";
  debianTarball = fetchzip {
    url = "http://deb.debian.org/debian/pool/main/b/busybox/${debianName}.debian.tar.xz";
    sha256 = "05n6mxc8n4zsli4dijrr2x5c9ggwi223i5za4n0xwhgd4lkhqymw";
  };

User Build

We can now complete the build.

ml load openssl curl
./configure --enable-gc --prefix=$myprefix \
  --with-store-dir=$nixdir/store --localstatedir=$nixdir/var \
  --with-boost=$BOOST_ROOT --disable-seccomp-sandboxing --disable-doc-gen \
  --with-sandbox-shell=/usr/bin/sh \
  CPPFLAGS="-I$HOME/.hpc/bzip2/1.0.8/include" \
  LDFLAGS="-L$HOME/.hpc/bzip2/1.0.8/lib -Wl,-R$HOME/.hpc/bzip2/1.0.8/lib"
make -j $(nproc)
make install
ml load nix/user # Hooray!
ml unload openssl curl

Now we still need to set a profile. Inspect $HOME/.hpc/nix/nix-boot/etc/profile.d/nix.sh and check the value of NIX_PROFILES.

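# Source the script (running it as a program would not change the current shell)
. $HOME/.hpc/nix/nix-boot/etc/profile.d/nix.sh
# OR, and this is better
# Create the profile directory first, then point nix-env at it
mkdir -p $HOME/.nix/var/nix/profiles
nix-env --switch-profile $HOME/.nix/var/nix/profiles/default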
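
To inspect NIX_PROFILES without opening the script in an editor, something like the following sketch works. show_nix_profiles is a hypothetical helper name; it only wraps grep.

```shell
# Hypothetical helper: print every line of a profile script that mentions NIX_PROFILES.
show_nix_profiles() {
    grep 'NIX_PROFILES' "$1"
}

# Example: show_nix_profiles "$HOME/.hpc/nix/nix-boot/etc/profile.d/nix.sh"
```

If the printed value still points under /nix, the profile has not been moved to the user-controlled location yet.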

We switch profiles because, in a multi-user setup, nix-env defaults to /nix/var/nix/profiles/per-user/$USER/profile, and it is better to keep the profile somewhere we control as well.

We also need to kill the sandbox for now, as also seen in the AUR package (and here).

# ~/.config/nix/nix.conf
sandbox = false
substituters = https://cache.nixos.org https://all-hies.cachix.org
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= all-hies.cachix.org-1:JjrzAOEUsD9ZMt8fdFbzo3jNAyEWlPAwdVuHw4RD43k=
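
Since nix.conf gets edited more than once in this post, an idempotent way of adding settings can help avoid duplicate keys. The sketch below is one possible approach; ensure_setting is a hypothetical helper, and it deliberately leaves an existing value untouched rather than overwriting it.

```shell
# Hypothetical helper: append "key = value" to a config file unless the key is already set.
# $1 = config file, $2 = key, $3 = value
ensure_setting() {
    mkdir -p "$(dirname "$1")"
    touch "$1"
    if ! grep -q "^[[:space:]]*$2[[:space:]]*=" "$1"; then
        printf '%s = %s\n' "$2" "$3" >> "$1"
    fi
}

# Example: ensure_setting "$HOME/.config/nix/nix.conf" sandbox false
```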

Now we can test this before moving forward:

nix-channel --update
nix-shell -p hello

Rebuilding Natively

The astute reader will have noticed that we glibly monkeyed around with the nix source in the previous section, but all will be made well since we can rebuild to use nix with itself. Do replace the variable with the corresponding path:

storeDir = "$HOME/.nix/store";
stateDir = "$HOME/.nix/var";
confDir = "$HOME/.nix/etc";
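
Nix strings do not expand shell variables, so "$HOME" has to be written out as a literal path. A small sketch that prints the three lines with the path filled in is given below; nix_dir_attrs is a hypothetical helper, and the third attribute is spelled confDir to match the `--sysconfdir=${confDir}` flag in the nixpkgs expression.

```shell
# Hypothetical helper: print the three attribute lines with the home directory expanded.
# $1 = absolute home directory
nix_dir_attrs() {
    printf 'storeDir = "%s/.nix/store";\n' "$1"
    printf 'stateDir = "%s/.nix/var";\n' "$1"
    printf 'confDir = "%s/.nix/etc";\n' "$1"
}

# Example: nix_dir_attrs "$HOME"
```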

Essentially, the $HOME/.config/nixpkgs/config.nix should look like (incorporating both the patches and also the full directory we will be using):

{
  packageOverrides = pkgs:
    with pkgs; {
      autogen = autogen.overrideAttrs (oldAttrs: {
        postInstall = ''
          mkdir -p $dev/bin
          mv $bin/bin/autoopts-config $dev/bin
          for f in $lib/lib/autogen/tpl-config.tlib $out/share/autogen/tpl-config.tlib; do
            sed -e "s|$dev/include|/no-such-autogen-include-path|" -i $f
            sed -e "s|$bin/bin|/no-such-autogen-bin-path|" -i $f
            sed -e "s|$lib/lib|/no-such-autogen-lib-path|" -i $f
          done
          # remove /tmp/** from RPATHs
          for f in "$bin"/bin/*; do
            local nrp="$(patchelf --print-rpath "$f" | sed -E 's@(:|^)/tmp/[^:]*:@\1@g')"
            patchelf --set-rpath "$nrp" "$f"
          done
        '' + stdenv.lib.optionalString (!stdenv.hostPlatform.isDarwin) ''
          # remove /build/** from RPATHs
          for f in "$bin"/bin/*; do
            local nrp="$(patchelf --print-rpath "$f" | sed -E 's@(:|^)/build/[^:]*:@\1@g')"
            patchelf --set-rpath "$nrp" "$f"
          done
        '';
      });
      nix = nix.overrideAttrs (oldAttrs: {
        storeDir = "/users/home/jdoe/.nix/store";
        stateDir = "/users/home/jdoe/.nix/var";
        confDir = "/users/home/jdoe/.nix/etc";
        doCheck = false;
        doInstallCheck = false;
        prePatch = ''
          substituteInPlace src/libstore/local-store.cc \
            --replace '(eaName == "security.selinux")' \
                      '(eaName == "security.selinux" || eaName == "system.nfs4_acl")'
          substituteInPlace src/libstore/gc.cc \
            --replace 'auto mapLines =' \
                      'continue; auto mapLines ='
          substituteInPlace src/libstore/sqlite.cc \
            --replace 'SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, 0) != SQLITE_OK)' \
                      'SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, "unix-dotfile") != SQLITE_OK)'
        '';
      });
    };
}

We can “speed up” our build by disabling all tests. Go to the copy of nixpkgs and run:

find pkgs  -type f -name 'default.nix' | xargs sed -i 's/doCheck = true/doCheck = false/'
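
Before letting sed loose on the whole tree, it can be reassuring to list the files that would be touched. A minimal sketch follows; files_with_checks is a hypothetical helper built from plain find and grep.

```shell
# Hypothetical helper: list default.nix files under a directory that still enable tests.
# $1 = directory to search (e.g. pkgs inside the nixpkgs checkout)
files_with_checks() {
    find "$1" -type f -name 'default.nix' -exec grep -l 'doCheck = true' {} +
}

# Example: files_with_checks pkgs | wc -l
```

Running it again after the sed pass should print nothing.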

Then build nix from the local nixpkgs checkout:

mkdir -p $HOME/.nix/var/nix/profiles/
nix-env -i nix -f $HOME/Git/Github/nixpkgs -j$(nproc) --keep-going --show-trace -v --cores 4 2>&1 | tee nix-no-root.log
ml load nix/bootstrapped

This will still take around 3-4 hours. Try to set this up on a lazy weekend to evade sysadmins.

If curl 429 rate limits are encountered for musl sources, the solution is to replace the source (put the following in a file, say no429.patch):

diff --git a/pkgs/os-specific/linux/musl/default.nix b/pkgs/os-specific/linux/musl/default.nix
index ae175a3..1a6f6c7 100644
--- a/pkgs/os-specific/linux/musl/default.nix
+++ b/pkgs/os-specific/linux/musl/default.nix
@@ -4,12 +4,12 @@
 }:
 let
   cdefs_h = fetchurl {
-    url = "http://git.alpinelinux.org/cgit/aports/plain/main/libc-dev/sys-cdefs.h";
+    url = "https://raw.githubusercontent.com/akadata/aports/master/main/libc-dev/sys-cdefs.h";
     sha256 = "16l3dqnfq0f20rzbkhc38v74nqcsh9n3f343bpczqq8b1rz6vfrh";
   };
   queue_h = fetchurl {
-    url = "http://git.alpinelinux.org/cgit/aports/plain/main/libc-dev/sys-queue.h";
-    sha256 = "12qm82id7zys92a1qh2l1qf2wqgq6jr4qlbjmqyfffz3s3nhfd61";
+    url = "https://raw.githubusercontent.com/akadata/aports/master/main/libc-dev/sys-queue.h";
+    sha256 = "049pd547ckrsky72s18a649mz660yph14wdrlw9gnbk903skdnz4";
   };
   tree_h = fetchurl {
     url = "http://git.alpinelinux.org/cgit/aports/plain/main/libc-dev/sys-tree.h";

This can be applied with git apply.

Usage

We have finally obtained a bootstrapped nix which is bound to our set of nixpkgs. To ensure its use:

ml use $HOME/Modulefiles
ml purge
ml load nix/bootstrapped
ml save

Flakes and DevShells

Newer versions of nix depend on mdbook, which is used to generate the documentation. Unfortunately, the cargoSha256 hashes are path dependent. A quick fix is to remove the dependency on mdbook and disable documentation generation with the following ugly patch:

diff --git a/pkgs/tools/package-management/nix/default.nix b/pkgs/tools/package-management/nix/default.nix
index 7eda5ae..91bf1b8 100644
--- a/pkgs/tools/package-management/nix/default.nix
+++ b/pkgs/tools/package-management/nix/default.nix
@@ -14,7 +14,7 @@ common =
   , pkg-config, boehmgc, libsodium, brotli, boost, editline, nlohmann_json
   , autoreconfHook, autoconf-archive, bison, flex
   , jq, libarchive, libcpuid
-  , lowdown, mdbook
+  , lowdown
   # Used by tests
   , gtest
   , busybox-sandbox-shell
@@ -36,7 +36,7 @@ common =

       VERSION_SUFFIX = suffix;

-      outputs = [ "out" "dev" "man" "doc" ];
+      outputs = [ "out" "dev" ];

       nativeBuildInputs =
         [ pkg-config ]
@@ -45,7 +45,6 @@ common =
           [ autoreconfHook
             autoconf-archive
             bison flex
-            (lib.getBin lowdown) mdbook
             jq
            ];

@@ -119,8 +118,8 @@ common =
         [ "--with-store-dir=${storeDir}"
           "--localstatedir=${stateDir}"
           "--sysconfdir=${confDir}"
-          "--disable-init-state"
           "--enable-gc"
+          "--disable-doc-gen"
         ]
         ++ lib.optionals stdenv.isLinux [
           "--with-sandbox-shell=${sh}/bin/busybox"
@@ -136,7 +135,8 @@ common =

       installFlags = [ "sysconfdir=$(out)/etc" ];

-      doInstallCheck = true; # not cross
+      doInstallCheck = false; # not cross
+      doCheck = false;

       # socket path becomes too long otherwise
       preInstallCheck = lib.optionalString stdenv.isDarwin ''
@@ -160,7 +160,7 @@ common =
         license = lib.licenses.lgpl2Plus;
         maintainers = [ lib.maintainers.eelco ];
         platforms = lib.platforms.unix;
-        outputsToInstall = [ "out" "man" ];
+        outputsToInstall = [ "out" ];
       };

       passthru = {

We also need to update our config.nix:

nixUnstable = nixUnstable.overrideAttrs (oldAttrs: {
  storeDir = "/users/home/jdoe/.nix/store";
  stateDir = "/users/home/jdoe/.nix/var";
  confDir = "/users/home/jdoe/.nix/etc";
  doCheck = false;
  doInstallCheck = false;
  prePatch = ''
    substituteInPlace src/libstore/local-store.cc \
      --replace '(eaName == "security.selinux")' \
                '(eaName == "security.selinux" || eaName == "system.nfs4_acl")'
    substituteInPlace src/libstore/gc.cc \
      --replace 'auto mapLines =' \
                'continue; auto mapLines ='
  '';
});

Now we can finally get to the installation of a newer version. I prefer to live life on the edge:

nix-env -iA nixUnstable -f $HOME/Git/Github/nixpkgs -j$(nproc) --keep-going --show-trace --cores 4 2>&1 | tee nix-install-base.log

We are now able to activate flakes and other features like nix shell (note the space!).

# ~/.config/nix/nix.conf
experimental-features = nix-command flakes

Bonus: Fixing Documentation

In order to get the original derivation working, we essentially need to modify the cargoSha256 hash. Thankfully the nix build log is rather verbose.

installing
error: hash mismatch in fixed-output derivation '/users/home/jdoen/.nix/sto$
e/n71nkimlbazmq1vpyyavqcxzg9c86brs-mdbook-0.4.7-vendor.tar.gz.drv':
         specified: sha256-2kBJcImytsSd7Q0kj1bsP/NXxyy2Pr8gHb8iNf6h3/4=
            got:    sha256-4bYLrmyI7cPUes6DYREiIB9gDze0KO2jMP/jPzvWbwQ=
error: 1 dependencies of derivation '/users/home/jdoen/.nix/store/wr31pgva8a
zn9jvvpa4bshykv80xf5qi-mdbook-0.4.7.drv' failed to build
error: 1 dependencies of derivation '/users/home/jdoen/.nix/store/y8pkc0hhgz
rvxgrj7c00mmsy50plya6p-nix-2.4pre20210326_dd77f71.drv' failed to build
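
Because the install command tees its output into a log file, the reported hash can be pulled out of that file instead of being copied by hand. The sketch below assumes the log keeps the "got:" line in the shape shown above; got_hash is a hypothetical helper.

```shell
# Hypothetical helper: print the hash that nix reports after "got:" in a build log.
# $1 = path to the saved log
got_hash() {
    sed -n 's/^[[:space:]]*got:[[:space:]]*\(sha256-[^[:space:]]*\).*$/\1/p' "$1"
}

# Example: got_hash nix-install-base.log
```

If several derivations fail in one run, this prints one hash per mismatch, so it is worth matching each one to its derivation in the log before editing anything.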

We need to modify pkgs/tools/text/mdbook/default.nix to update the hash; and then:

nix-env -iA nixUnstable -f $HOME/Git/Github/nixpkgs -j$(nproc) --keep-going --show-trace --cores 4 2>&1 | tee nix-install-base.log
ml load nix/bootstrapped
nix-shell --help # Works
nix-shell -p hello # Also works

Channels

We would like to move away from having to constantly pass our cloned set of packages.

nix-channel --add https://nixos.org/channels/nixpkgs-unstable nixpkgs
nix-channel --update

Basic Packages

Now we can get some basic stuff too.

nix-env -i tmux zsh lsof pv git -f $HOME/Git/Github/nixpkgs -j$(nproc) --keep-going --show-trace --cores 4 2>&1 | tee nix-install-base.log

Ruby Caveats

No longer relevant as of April 2020

While installing packages which depend on ruby, there will be permission errors inside the build folder. These can be “fixed” by setting very permissive controls on the build-directory in question. Do not set permissions directly on the .nix/store/$HASH folder, as doing so will make nix reject the build artifact.

# neovim depends on ruby
nix-env -i neovim -v -f $HOME/Git/Github/nixpkgs

A more elegant way to fix permissions involves a slightly more convoluted approach. We can note where the build is occurring (e.g. /tmp) and run a watch command to fix permissions.

watch -n1 -x chmod 777 -R /tmp/nix-build-ruby-2.6.6.drv-0/source/lib/

Naturally this must be run in a separate window.

Dotfiles

Feel free to set up dotfiles (mine, perhaps) to profit even further. We will consider the process of obtaining my set below. Minimally, we will want to obtain tmux and zsh.

nix-env -i tmux zsh -v -f $HOME/Git/Github/nixpkgs

Now we can set the dotfiles up.

git clone https://github.com/HaoZeke/Dotfiles
cd Dotfiles
$HOME/.local/bin/dotgit restore hzhpc

The final installation configures neovim and tmux.

zsh
# Should install things with zinit
tmux
# CTRL+b --> SHIFT+I to install
nvim

Misc NFS

For issues concerning NFS lock files, consider simply moving the problematic file and letting things sort themselves out. Consider:

nix-build
# something about a .nfs lockfile in some .nix/$HASH-pkg/.nfs0234234
mv .nix/$HASH-pkg/ .diePKGs/
nix-build # profit

The right way to deal with this is of course:

nix-build
# +D expects a directory; for a single file, pass the path directly
lsof .nix/$HASH-pkg/.nfs0234234
kill $whatever_blocks
nix-build # profit
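
The .nfsXXXX files are what an NFS client leaves behind when a file is deleted while some process still has it open. A minimal sketch for locating them is given below; stale_nfs_files is a hypothetical helper, and the store path in the example assumes the layout used throughout this post.

```shell
# Hypothetical helper: list NFS placeholder files (.nfsXXXX) under a directory.
# $1 = directory to search
stale_nfs_files() {
    find "$1" -type f -name '.nfs*'
}

# Example: stale_nfs_files "$HOME/.nix/store"
# For each file found, `lsof -t <file>` prints only the PIDs holding it open.
```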

Conclusions

Though this is slow and seems like an inefficient use of cluster resources, the benefits of reproducible environments typically outweigh the cost. Also, it is much more pleasant to have a proper package manager which can work with Dotfiles.


  1. Note that this will of course entail rebuilding everything from scratch, every time, which means no binary caches. Thus there is no reasonable defence for trying this out without access to a high-powered, limited-access machine.

  2. The rest of the post assumes we are on the same page and working towards the same end-goal; substitute and remix at will.