Monkeying around with nix for HPC systems which have no root access and NFS filesystems.

Background

Nix is not well known for being friendly to users without root access. This is typically made worse by the “exotic” filesystem attributes common to HPC networks (this also plagues hermes). An earlier post details how and why proot failed. The short pitch is simply:

Figure 1: Does your HPC look like this?

Figure 2: It really is an HPC

If your HPC doesn’t look that swanky and you’d like it to, then read on. All the usual benefits of nix apply as well, but this makes for a more eye-catching pitch.

Setup

The basic concept is to install nix from source, with appropriate patches, and then adjust paths until it is ready and willing to work with stores which are not /nix. 1

This concept is strongly influenced by the work described in this repo. The premise is similar to my earlier post on HPC Dotfiles. For the purposes of this post, we will assume that all the packages in the previous post exist. lmod is not required; feel free to use an alternative path management system, or even just $HOME/.local, but if lmod is present, using it is highly recommended 2. We will need the following:

Pinned set of nixpkgs
We would like to be able to modify a lot of paths, which is normally bad practice, but then we don’t normally rebuild all packages either. Grab a copy of nixpkgs by following the instructions below. Now is also the time to fork the repo if you’d like to keep track of your changes.
mkdir -p $HOME/Git/Github
cd $HOME/Git/Github
git clone https://github.com/NixOS/nixpkgs
dotgit
We use the older, bash version of the excellent dotgit since python is not always present in HPC environments.
git clone https://github.com/kobus-v-schoor/dotgit/
mkdir -p $HOME/.local/bin
cp -r dotgit/old/bin/bash_completion dotgit/old/bin/dotgit dotgit/old/bin/dotgit_headers dotgit/old/bin/fish_completion.fish $HOME/.local/bin/
lmod packages
If you do not or cannot use modulefiles as described in the earlier post, inspect the module-files being loaded and set paths accordingly.
cd $HOME/Git/Github
git clone https://github.com/HaoZeke/hzHPC_lmod
cd hzHPC_lmod
$HOME/.local/bin/dotgit restore hzhpc

Now we can start by obtaining the nix sources.

myprefix=$HOME/.hpc/nix/nix-boot
nixdir=$HOME/.nix
nix_version=2.3.7
ml load gcc/9.2.0 flex bison
ml load boost
ml load editline
ml load brotli/1.0.1
ml load libseccomp/2.4.4
ml load bdwgc/8.0.4
ml load bzip2/1.0.8
ml load openssl sqlite
wget http://nixos.org/releases/nix/nix-${nix_version}/nix-${nix_version}.tar.bz2
tar xfv nix-${nix_version}.tar.bz2
cd nix-${nix_version}
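
Since the module names above are specific to one cluster, it can help to confirm that the tools the build relies on are actually on the PATH before configuring anything. The following is a minimal sketch; the helper name missing_tools and the tool list are assumptions made for illustration and are not part of nix or lmod.

```shell
# Print every command from the argument list that is not found on PATH.
# Hypothetical helper; adjust the list to whatever your cluster provides.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools used in this section (gcc, flex and bison come from the modules above).
# No output means everything was found.
missing_tools gcc flex bison wget tar make
```

Any name that is printed points at a module which still needs to be loaded, or at a package which has to be built first.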

Before actually configuring and installing from source, we need some patches.

Patches

I suggest carefully typing out the patches, though leave a comment if you want a repo with these changes (if you must star something in the meantime, star this).

Remove the following ifdef stuff from src/libutil/compression.cc, leaving only the contents of the else statement.

#ifdef HAVE_LZMA_MT
            lzma_mt mt_options = {};
            mt_options.flags = 0;
            mt_options.timeout = 300; // Using the same setting as the xz cmd line
            mt_options.preset = LZMA_PRESET_DEFAULT;
            mt_options.filters = NULL;
            mt_options.check = LZMA_CHECK_CRC64;
            mt_options.threads = lzma_cputhreads();
            mt_options.block_size = 0;
            if (mt_options.threads == 0)
                mt_options.threads = 1;
            // FIXME: maybe use lzma_stream_encoder_mt_memusage() to control the
            // number of threads.
            ret = lzma_stream_encoder_mt(&strm, &mt_options);
            done = true;
#else
            printMsg(lvlError, "warning: parallel XZ compression requested but not supported, falling back to single-threaded compression");
#endif
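
If typing the patch by hand feels error prone, the removal described above can be approximated with awk. This is only a sketch under a few assumptions: the helper name strip_lzma_mt is made up, the preprocessor lines are assumed to start in the first column as shown above, and nested preprocessor blocks inside the region are not handled. Review the diff before replacing the original file.

```shell
# Drop the HAVE_LZMA_MT branch and keep only the body of the #else branch.
# Hypothetical helper: reads the file named in $1, writes the result to stdout.
strip_lzma_mt() {
    awk '
        /^#ifdef HAVE_LZMA_MT/ { inblock = 1; skip = 1; next }  # start dropping lines
        inblock && /^#else/    { skip = 0; next }               # keep lines from here on
        inblock && /^#endif/   { inblock = 0; skip = 0; next }  # region is finished
        !skip                  { print }
    ' "$1"
}

# Suggested use from the top of the nix source tree:
# strip_lzma_mt src/libutil/compression.cc > compression.cc.new
# diff -u src/libutil/compression.cc compression.cc.new
```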

If there is trouble with the bzip2 library, change the bzlib.h include in src/libutil/compression.cc to the full path $HOME/.hpc/bzip2/1.0.8/include/bzlib.h, with $HOME expanded by hand.

Finally, you will need to edit nixpkgs.

# vim pkgs/os-specific/linux/busybox/default.nix
  debianName = "busybox_1.30.1-6";
  debianTarball = fetchzip {
    url = "http://deb.debian.org/debian/pool/main/b/busybox/${debianName}.debian.tar.xz";
    sha256 = "05n6mxc8n4zsli4dijrr2x5c9ggwi223i5za4n0xwhgd4lkhqymw";
  };

User Build

We can now complete the build.

./configure  --enable-gc --prefix=$myprefix --with-store-dir=$nixdir/store --localstatedir=$nixdir/var --with-boost=$BOOST_ROOT --disable-seccomp-sandboxing --disable-doc-gen CPPFLAGS="-I$HOME/.hpc/bzip2/1.0.8/include" LDFLAGS="-L$HOME/.hpc/bzip2/1.0.8/lib -Wl,-R$HOME/.hpc/bzip2/1.0.8/lib"
make -j $(nproc)
make install
ml load nix/user # Hooray!
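
One thing worth guarding against: if nixdir or myprefix is unset in a fresh shell, the configure line above silently expands to paths such as /store. A small check before configuring avoids that. This is a sketch; require_set is a hypothetical helper, not something provided by nix or lmod.

```shell
# Fail if any of the named shell variables is unset or empty.
# Hypothetical helper; pass variable names, not values.
require_set() {
    for name in "$@"; do
        # Look the variable up by name; an unset variable yields an empty string
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            printf 'unset or empty: %s\n' "$name" >&2
            return 1
        fi
    done
}

# Suggested use, in front of the configure call:
# require_set myprefix nixdir BOOST_ROOT && ./configure ...
```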

Now we still need to set a profile. Inspect $HOME/.hpc/nix/nix-boot/etc/profile.d/nix.sh and check the value of NIX_PROFILES.

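# Create the profile directory first, otherwise switching profiles has nowhere to go
mkdir -p $HOME/.nix/var/nix/profiles
# Source the script instead of executing it, so the variables reach the current shell
. $HOME/.hpc/nix/nix-boot/etc/profile.d/nix.sh
# OR, and this is better
nix-env --switch-profile $HOME/.nix/var/nix/profiles/default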

Rebuilding Natively

The astute reader will have noticed that we glibly monkeyed around with the nix source in the previous section, but all will be made well since we can rebuild nix with itself. In the nix expression of your nixpkgs copy, set the following, replacing $HOME with the corresponding absolute path:

storeDir = "$HOME/.nix/store";
stateDir = "$HOME/.nix/var";
confDir = "$HOME/.nix/etc";
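
To avoid typos while expanding the home directory by hand, the three lines can be generated and then pasted. This is a sketch; expanded_store_settings is a hypothetical helper, and the last setting is spelled confDir on the assumption that this is the intended name.

```shell
# Print the three assignments with the home directory written out in full.
# Hypothetical helper; $1 is the home directory to substitute.
expanded_store_settings() {
    printf 'storeDir = "%s/.nix/store";\n' "$1"
    printf 'stateDir = "%s/.nix/var";\n' "$1"
    printf 'confDir = "%s/.nix/etc";\n' "$1"   # name assumed, see the note above
}

expanded_store_settings "$HOME"
```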

We can “speed up” our build by disabling all tests. Go to the copy of nixpkgs and run:

find pkgs  -type f -name 'default.nix' | xargs sed -i 's/doCheck = true/doCheck = false/'
mkdir -p $HOME/.nix/var/nix/profiles/
nix-env -i nix -f $HOME/Git/Github/nixpkgs -j$(nproc) --keep-going --show-trace -v --cores 4 2>&1 | tee nix-no-root.log
ml load nix/bootstrapped
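
Before letting sed loose on the whole tree, it may be worth seeing which files it is going to touch. The sketch below lists them; files_with_checks is a hypothetical helper name, and it only looks for the exact string that the sed expression above rewrites.

```shell
# List the default.nix files under a directory that contain 'doCheck = true'.
# Hypothetical helper; $1 is the root to search (pkgs in the nixpkgs checkout).
# The exit status may be non-zero when some files do not match; use the output.
files_with_checks() {
    find "$1" -type f -name 'default.nix' -exec grep -l 'doCheck = true' {} +
}

# Count the affected files from the top of the nixpkgs checkout:
# files_with_checks pkgs | wc -l
```

If the changes are committed to a fork of nixpkgs afterwards, the same list doubles as a record of what was disabled.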

This will still take around 3-4 hours. Try to set this up on a lazy weekend to evade sysadmins.

Usage

We have finally obtained a bootstrapped nix which is bound to our set of nixpkgs. To ensure its use:

ml use $HOME/Modulefiles
ml purge
ml load nix/bootstrapped
ml save

Basic Packages

Now we can get some basic stuff too.

nix-env -i tmux zsh lsof pv git -f $HOME/Git/Github/nixpkgs -j$(nproc) --keep-going --show-trace --cores 4 2>&1 | tee nix-install-base.log

Ruby Caveats

While installing packages which depend on ruby, there will be permission errors inside the build folder. These can be “fixed” by setting very permissive controls on the build-directory in question. Do not set permissions directly on the .nix/store/$HASH folder, as doing so will make nix reject the build artifact.

# neovim depends on ruby
nix-env -i neovim -v -f $HOME/Git/Github/nixpkgs

A more elegant way to fix permissions involves a slightly more convoluted approach. We can note where the build is occurring (e.g. /tmp) and run a watch command to fix permissions.

watch -n1 -x chmod 777 -R /tmp/nix-build-ruby-2.6.6.drv-0/source/lib/

Naturally this must be run in a separate window.
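
Where watch is not installed, a plain shell loop does roughly the same job and stops on its own after a fixed number of rounds. This is a sketch under assumptions: fix_perms_loop is a made-up name, and the build path has to be taken from the output of the failing build.

```shell
# Once per second, open up permissions on a build directory if it exists.
# Hypothetical helper: $1 is the directory, $2 the number of rounds to run.
fix_perms_loop() {
    dir=$1
    rounds=$2
    while [ "$rounds" -gt 0 ]; do
        # The directory only exists while the build is running
        if [ -d "$dir" ]; then
            chmod -R 777 "$dir" 2>/dev/null
        fi
        rounds=$((rounds - 1))
        sleep 1
    done
}

# In a second terminal, for about ten minutes:
# fix_perms_loop /tmp/nix-build-ruby-2.6.6.drv-0/source/lib 600
```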

Dotfiles

Feel free to set up dotfiles (mine, perhaps) to profit even further. We will consider the process of obtaining my set below. Minimally, we will want to obtain tmux and zsh.

nix-env -i tmux zsh -v -f $HOME/Git/Github/nixpkgs

Now we can set the dotfiles up.

git clone https://github.com/HaoZeke/Dotfiles
cd Dotfiles
$HOME/.local/bin/dotgit restore hzhpc

The final installation configures neovim and tmux.

zsh
# Should install things with zinit
tmux
# CTRL+b --> SHIFT+I to install
nvim

Misc NFS

For issues concerning NFS lock files, consider simply moving the problematic directory out of the way and letting things sort themselves out. Consider:

nix-build
# something about a .nfs lockfile in some .nix/$HASH-pkg/.nfs0234234
mv .nix/$HASH-pkg/ .diePKGs/
nix-build # profit
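
The files named .nfsXXXX are left behind by the NFS client when a file is deleted while still open. The move above can be wrapped in a small helper which only acts when such a file is actually present. This is a sketch; park_if_nfs_locked and the holding directory are hypothetical names.

```shell
# Move a directory aside if it contains leftover NFS files (.nfsXXXX).
# Hypothetical helper: $1 is the offending directory, $2 a holding directory.
# Returns 0 if the directory was moved, 1 if no .nfs* file was found in it.
park_if_nfs_locked() {
    if [ -n "$(find "$1" -name '.nfs*' 2>/dev/null | head -n 1)" ]; then
        mkdir -p "$2" && mv "$1" "$2"/
    else
        return 1
    fi
}

# Suggested use, with the path copied from the nix-build error message:
# park_if_nfs_locked <path from the error> "$HOME/.diePKGs" && nix-build
```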

Conclusions

Though this is slow and seems like an inefficient use of cluster resources, the benefits of reproducible environments typically outweigh the cost. It is also much more pleasant to have a proper package manager which can work with Dotfiles.


  1. Note that this will of course entail rebuilding everything from scratch, every time, which means no binary caches. Thus there is no reasonable defence for trying this out without access to a high-powered, limited-access machine. ↩︎

  2. The rest of the post assumes we are on the same page and working towards the same end-goal, substitute and remix at will ↩︎