Tweag developed rules_nixpkgs to give Bazel users access to
Nix’s reproducible builds and its extensive package registry. The ruleset has proven
especially useful in projects that require complex dependency management and
consistent build environments.
However, rules_nixpkgs is incompatible with remote execution. This is a major limitation, given that remote
execution is possibly the main reason why people switch to Bazel, and that rules_nixpkgs provides a great way to configure hermetic toolchains, which are an important ingredient for reliable remote execution. There is no trivial fix, as
can be seen in the related, longstanding open issue. At Tweag we
investigated a promising solution presented at Bazel eXchange 2022 (recording), but these ideas
were never implemented in a public proof of concept.
In this post, we present our new remote execution infrastructure repo and walk you
through the steps required to understand and reproduce how it achieves remote execution with
rules_nixpkgs.
The remote execution limitation
When we make use of rules_nixpkgs, we instruct Bazel to use packages from nixpkgs
rather than those from the host system. This means that when we try to build a C++ project, Bazel won’t use the
gcc compiler, which is typically found under /usr/bin, but instead will use the compiler specified
by rules_nixpkgs and provided by Nix, typically stored under some /nix/store/<unique_hash>-gcc/bin directory.
Bazel distinguishes actions that import external dependencies from regular build actions. The former are always executed locally [1], while the latter can be distributed using remote execution. rules_nixpkgs falls into the former category and invokes Nix to download and install the required /nix/store/<unique_hash>-gcc path locally on your machine.
This scenario works fine when we’re building locally. However, when we enable remote execution, rules_nixpkgs still installs dependencies locally, while the build happens on another machine that
does not have those paths available, so the build inevitably fails.
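The mismatch can be made concrete with a small check. The sketch below is only an illustration: check_tool_path is a hypothetical helper, not part of Bazel or rules_nixpkgs, and the store path in the example call is the elided placeholder from the text above. Run on the machine that invoked Bazel, it would find the compiler; run on a remote executor without a populated /nix/store, it would not.

```shell
# Hypothetical helper: report whether an executable exists at the given path
# on the machine where this script runs.
check_tool_path() {
  if [ -x "$1" ]; then
    echo "available on this machine: $1"
  else
    echo "missing on this machine: $1"
  fi
}

# Example call; substitute a real store path for the placeholder:
# check_tool_path "/nix/store/<unique_hash>-gcc/bin/gcc"
```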
Initial setup with remote execution
For our proof of concept, we decided to use Buildbarn to provide the remote execution
endpoint and infrastructure. Buildbarn provides Kubernetes manifests that we can use to deploy all
the necessary Buildbarn components for remote execution to work. We’ll be using the examples from
the bb-deployments repository to test our setup, but also modifying it to make
use of rules_nixpkgs.
To replicate our implementation you’ll need a working Buildbarn infrastructure, which in this case would be a Kubernetes cluster. You can use our guide to set up a cluster on AWS.
Test remote execution without rules_nixpkgs
To make sure that everything is working as expected, we’ll use the @abseil-hello Bazel target,
which is available in the Buildbarn deployments repo. This example does not use
rules_nixpkgs yet. You can clone the bb-deployments repository if you want to follow
along.
- Get the service endpoint of the Buildbarn executor service (frontend). If you’re deploying on a cloud provider this would be a load-balancer.
$ kubectl get services -n buildbarn
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP                         PORT(S)                      AGE
browser     ClusterIP      172.20.22.171   <none>                              7984/TCP                     8d
frontend    LoadBalancer   172.20.126.97   xxxxx.us-east-1.elb.amazonaws.com   8980:31657/TCP               8d
scheduler   ClusterIP      172.20.83.110   <none>                              8982/TCP,8983/TCP,7982/TCP   8d
storage     ClusterIP      None            <none>                              8981/TCP                     8d
- Update .bazelrc to use the remote executor endpoint of our environment.
...
build:remote-exec --remote_executor=grpc://[endpoint-from-previous-step]
...
Now we can try building the @abseil-hello target using the remote execution infrastructure. Note that we’ll
be using a custom toolchain specific to the default executors created by Buildbarn.
$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main
Test remote execution with rules_nixpkgs
Once we have validated that our setup works, we can create a new target that uses rules_nixpkgs.
Update .bazelversion to use 6.4, a version supported by rules_nixpkgs (any other
6.x version should work as well).
Update the WORKSPACE file with the following:
http_archive(
    name = "io_tweag_rules_nixpkgs",
    strip_prefix = "rules_nixpkgs-244ae504d3f25534f6d3877ede4ee50e744a5234",
    urls = ["https://github.com/tweag/rules_nixpkgs/archive/244ae504d3f25534f6d3877ede4ee50e744a5234.tar.gz"],
)
load("@io_tweag_rules_nixpkgs//nixpkgs:repositories.bzl", "rules_nixpkgs_dependencies")
rules_nixpkgs_dependencies()
load("@io_tweag_rules_nixpkgs//nixpkgs:nixpkgs.bzl", "nixpkgs_git_repository", "nixpkgs_package", "nixpkgs_cc_configure")
load("@io_tweag_rules_nixpkgs//nixpkgs:toolchains/go.bzl", "nixpkgs_go_configure") # optional
nixpkgs_git_repository(
    name = "nixpkgs",
    revision = "23.11",
)
nixpkgs_cc_configure(
    name = "nixpkgs_config_cc",
    repository = "@nixpkgs",
    attribute_path = "clang",
)
This is the standard boilerplate to install rules_nixpkgs in our Bazel workspace. We’re also
creating a reference to the nixpkgs repository, and a C++ toolchain using clang.
Next, we create a new cc_binary target in BUILD.bazel with a simple hello-world program.
$ cat BUILD.bazel
...
cc_binary(
    name = "hello-world",
    srcs = ["hello-world.cc"],
)
$ cat hello-world.cc
#include <iostream>
int main(int argc, char** argv) {
  std::cout << "Hello world!" << std::endl;
  return 0;
}
Now we need to update the custom Buildbarn toolchain used by the executors to reference
@nixpkgs_config_cc. Update the file tools/remote-toolchains/BUILD.bazel and replace the instances
of @remote_config_cc with @nixpkgs_config_cc.
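The replacement can be scripted. The following is a minimal sketch, assuming the only change needed is the literal substitution described above; retarget_toolchain is a name chosen here for illustration. It writes through a temporary file so that it behaves the same with GNU and BSD sed.

```shell
# Replace every reference to @remote_config_cc with @nixpkgs_config_cc
# in the given BUILD file (illustrative helper, not part of bb-deployments).
retarget_toolchain() {
  file="$1"
  tmp="$(mktemp)" || return 1
  # Write the substituted content to a temporary file, then copy it back
  # so the original file keeps its permissions.
  sed 's/@remote_config_cc/@nixpkgs_config_cc/g' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage, from the root of the workspace:
# retarget_toolchain tools/remote-toolchains/BUILD.bazel
```

Reviewing the result with git diff before building is a cheap way to confirm that nothing else was touched.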
We can try building the application using the C++ toolchain we defined with rules_nixpkgs. We expect
this to fail because the executors are not Nix-aware yet.
$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main
...
ERROR: /home/user/.cache/bazel/_bazel_user/5ce2ca33a49034ed7557e24d70204ce5/external/com_google_absl/absl/base/BUILD.bazel:324:11: Compiling absl/base/internal/throw_delegate.cc failed: (Exit 34): Remote Execution Failure:
Invalid Argument: Failed to run command: Failed to start process: fork/exec /nix/store/n37gxbg343hxin3wdryx092mz2dkafy8-clang-wrapper-16.0.6/bin/cc: no such file or directory
...
Because the executors don’t have the /nix/store available, they cannot resolve the compiler path
which is generated locally on our machine when we invoke bazel build.
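When a remote build fails this way, it can be useful to list which store paths the executors were missing. Below is a minimal sketch, assuming the build log is available on standard input; extract_store_paths is a hypothetical helper name, and the pattern relies on store path names starting with a 32-character hash, as in the error above.

```shell
# Print the unique /nix/store paths mentioned in a build log read from stdin.
# Each match stops at the end of the store path name, so trailing components
# such as /bin/cc are dropped.
extract_store_paths() {
  grep -oE '/nix/store/[0-9a-z]{32}-[^/ :]+' | sort -u
}

# Usage (Bazel writes its diagnostics to stderr):
# bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main 2>&1 | extract_store_paths
```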
Now let’s see how we can solve this problem by configuring the executors to access a shared
/nix/store via NFS.
NFS-based solution
Our solution involves a Nix server that bridges this gap. This server manages and synchronizes the Nix dependencies across the Bazel build environment.
Here’s how it works:
- During bazel build, the rules_nixpkgs repository rules will build and copy any Nix derivation to the remote Nix server.
- The Nix server will export the /nix/store directory tree to the executors via a read-only NFS share.
- When a build is triggered, all necessary dependencies are already available on the executors, allowing the build process to continue.
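For intuition, the first step resembles pushing a locally built store path, together with its closure, to another machine. The sketch below only prints a rough manual equivalent; it is an assumption for illustration, not the commands rules_nixpkgs actually runs. The helper name, the <nixpkgs> channel lookup and the example arguments are placeholders, and nix copy requires the nix-command experimental feature to be enabled.

```shell
# Print (not run) commands that would build one nixpkgs attribute locally
# and copy the result to a remote Nix store over SSH.
print_push_commands() {
  host="$1"   # SSH alias of the Nix server (placeholder)
  attr="$2"   # nixpkgs attribute to build (placeholder)
  # --no-out-link prints the store path without creating a ./result symlink.
  printf '%s\n' "out=\$(nix-build '<nixpkgs>' -A $attr --no-out-link)"
  # nix copy transfers the path and everything it depends on.
  printf '%s\n' "nix copy --to ssh://$host \"\$out\""
}

print_push_commands nix-server clang
```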
Implementation-wise, we’ll need to make the following changes to the Buildbarn infrastructure:
- A Nix server. This could be a VM with Nix installed that exports the /nix/store directory as a read-only NFS share over the private network. We’ll need SSH access to that server from the machine that invokes bazel build.
- Kubernetes executors with the exported NFS share mounted.
For a detailed setup guide and implementation specifics, refer to our infrastructure repository.
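To give a rough idea of the two pieces, the sketch below prints a read-only export entry for the server and the matching mount command for a client. It is an assumption-laden illustration rather than the configuration from the infrastructure repository: the helper names, the network range and the server address are placeholders, and in Kubernetes the share could instead be declared as an NFS volume on the executor pods.

```shell
# Print an /etc/exports entry that shares /nix/store read-only with a client network.
# The CIDR argument is a placeholder for the cluster's private network.
nix_store_export_line() {
  printf '/nix/store %s(ro,sync,no_subtree_check)\n' "$1"
}

# Print the command a client host would run to mount the share read-only.
# The server argument is a placeholder for the Nix server's private address.
nix_store_mount_command() {
  printf 'mount -t nfs -o ro %s:/nix/store /nix/store\n' "$1"
}

nix_store_export_line "10.0.0.0/16"
nix_store_mount_command "10.0.0.5"
```

Mounting the share at /nix/store on the client matters because the paths recorded by Bazel are absolute.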
To instruct rules_nixpkgs to copy the Nix derivations to the server, we’ll need to create
an entry in our SSH config (typically found under ~/.ssh/config) for the remote server and then
set the environment variable BAZEL_NIX_REMOTE to the name of that entry.
# SSH Configuration
$ cat ~/.ssh/config
Host nix-server
  Hostname [public-ip]
  IdentityFile [ssh-private-key]
  Port [ssh-port]
  User [ssh-user]
Testing out remote execution again
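Before rebuilding, a quick pre-flight check can confirm that the SSH alias works without a password prompt and that Nix is available on the server. This is a hedged sketch: check_nix_remote and the SSH_CMD override are names invented here, and the check assumes that nix is on the PATH of non-interactive SSH sessions on the server.

```shell
# Hypothetical pre-flight check: confirm that the SSH alias works non-interactively
# and that the nix command is available on the server.
# SSH_CMD can be overridden (for example in tests); it defaults to ssh.
check_nix_remote() {
  host="${1:-${BAZEL_NIX_REMOTE:-}}"
  if [ -z "$host" ]; then
    echo "error: no host given and BAZEL_NIX_REMOTE is not set" >&2
    return 1
  fi
  # BatchMode=yes makes ssh fail instead of prompting for a password.
  if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" nix --version; then
    echo "ok: nix is reachable on $host"
  else
    echo "error: could not run nix on $host" >&2
    return 1
  fi
}

# Usage:
# check_nix_remote nix-server
```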
With the new setup, we can try building the project again.
$ export BAZEL_NIX_REMOTE=nix-server
$ bazel clean --expunge # To refetch the Nix derivations
$ bazel build --config=remote-ubuntu-22-04 @abseil-hello//:hello_main
You should now see lines like the following, confirming communication with the Nix server:
...
Analyzing: target @abseil-hello//:hello_main (0 packages loaded, 0 targets configured)
    Fetching repository @nixpkgs_config_cc_info; Remote-building Nix derivation 9s
...
And the build should be successful.
Conclusion
In this post, we explored the challenges of integrating rules_nixpkgs with remote
execution in Bazel and presented our solution. Of course, this solution is not perfect, and it comes with some shortcomings that end
users should be aware of.
- The first issue is about cache eviction. Caching all the Nix paths over the long term is not practical from a storage standpoint. That’s why we need a way to mark the required paths and garbage collect the others. A Nix path should be available as long as a client may trigger a remote build that uses it. However, there’s no way to determine when a client no longer needs a specific path. A simple solution would be to evict the least used paths. That would require a tighter integration with the Bazel APIs in order to track Nix path usage.
- The second issue relates to NFS performance. This depends on the infrastructure and workloads in operation. At a minimum, we want to tune the NFS synchronization so that the paths are available before any build begins. Slow synchronization between the NFS server and client can lead to failed builds.
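The least-used eviction idea can be sketched as a selection step. This is an illustration only: it assumes a hypothetical usage log with one "<last-used-epoch-seconds> <store-path>" pair per line, which neither Bazel nor Nix produces today, and it only prints eviction candidates rather than deleting anything.

```shell
# Read "<last-used-epoch-seconds> <store-path>" lines from stdin (hypothetical
# log format) and print the N least recently used store paths, oldest first.
least_recently_used() {
  count="$1"
  sort -n -k1,1 | head -n "$count" | cut -d' ' -f2-
}

# Usage, with a made-up log file name:
# least_recently_used 10 < nix-path-usage.log
```

The printed paths could then be passed to nix-store --delete, which refuses to remove paths that are still reachable from a garbage collector root.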
[1] Bazel has an experimental feature that enables remotable repository rule actions. However, their capabilities are too limited to support the rules_nixpkgs use case.
Behind the scenes
An SRE/DevOps engineer with a keen interest in networking, infrastructure and build systems.
Guillaume has a background in computer science, engineering and applied mathematics. Regarding software systems, his main concern is correctness, reliability, and trustworthiness. He is passionate about understanding complex systems and untangling intricate issues.
