NixOS IPv6 Prefix Delegation

Note: This used to be a draft that I had lying around for more than a year now. I’ve decided to just publish it now.

My home internet connection comes with proper dual-stack support. I get a public IPv4 address via DHCP, a /128 (IA_NA) and a /48 IPv6 prefix (IA_PD) via DHCPv6.

Traditionally you would configure your networking via iproute2 and then fork off a DHCP client to configure the external addresses.

Usually all of that is being hidden from you through wrappers (like Debian’s ifupdown). The configuration would be set, the daemons fired off and hopefully everything would go well.

The limitations of that approach become visible once you have a more dynamic set of interfaces that have to be initialized in some order, VPN devices that depend on the uplink connection, etc. While systems like ifupdown offer hooks of all sorts, you still end up writing a bunch of (inlined) shell scripts that deal with the little details of your setup, adding sleep statements at worst. Things get tricky when one interface going up changes things on another interface or even system-wide (think sysctl, iptables, starting a VPN daemon, …).

As soon as your configuration grows (over days, months, years, …) it becomes harder to verify that everything still works after a clean reboot of the system. After all you’ve been modifying the system with each “restart” of the network configuration script/tooling.

I’ve had systems with huge networking scripts that worked reliably after a reboot. Was it fun? Was it straightforward? No, not really. You really had to know what you were doing. Having to manually fiddle with some parameters after a reboot wasn’t uncommon. Sometimes services wouldn’t come up properly because the network was not up yet when they were started.

Here is where we should start talking about systemd-networkd (or just networkd). networkd allows you to declare everything (or really almost everything) that you previously encoded in lines and lines of shell-like configuration files as structured configuration. That doesn’t just include static configuration such as addresses, routes, VLANs, VRFs, bridges, policy routing, WireGuard, … but also runtime configuration like DHCPv4, DHCPv6 and prefix delegation. You can configure when the system is to be considered “online” and which interfaces (in which states) contribute to that.[1]

This makes for a very nice property: You now have a single directory of plain text files that define all your network configurations. As long as you have networkd installed and a way to render your /etc/systemd/network configuration you are good. Rolling back a change now just means rolling back that directory. Versioning text files is what we are good at, no?
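To make the "plain text files" point concrete, here is a minimal sketch of how structured data can be turned into a networkd `.network` unit. The `render_network_unit` helper and the `uplink` dictionary are made up for illustration; NixOS does this through its module system, not through Python. The section and option names are the ones used for the uplink interface later in this post.

```python
def render_network_unit(sections):
    """Render a dict of {section: {key: value}} as an INI-style networkd unit."""
    lines = []
    for section, options in sections.items():
        lines.append(f"[{section}]")
        for key, value in options.items():
            # networkd accepts yes/no for boolean options.
            if isinstance(value, bool):
                value = "yes" if value else "no"
            lines.append(f"{key}={value}")
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip("\n") + "\n"


# Hypothetical structured description of the uplink interface.
uplink = {
    "Match": {"Name": "eth1"},
    "Network": {"Description": "ISP interface", "IPv6AcceptRA": True},
    "Link": {"RequiredForOnline": "routable"},
    "DHCPv6": {"PrefixDelegationHint": "::/48"},
}

unit_text = render_network_unit(uplink)
print(unit_text)
```

Since the result is just text, diffing two generations of the configuration or rolling back to an older one needs no special tooling.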

On top of that we can make use of Nix, and specifically the NixOS module system, to generate that networkd configuration and also manage the services on our system in the very same way.

And this gets me back to the topic of this post: my residential internet connection and configuring my CPE, aka router. I was planning to migrate from my venerable old Debian system to a shiny new NixOS machine, but I didn’t just want to pull the plug and fiddle until everything was seemingly working again.

I decided to just simulate a simplified version of the relevant pieces of the ISP infra and test my potential configuration against that. Since the termination just happens on an (isolated) Ethernet L2, all I really worried about was getting the DHCPv6 client configuration right, especially the prefix delegation. For this blog post I trimmed it down to just the DHCPv6 prefix delegation part.

This is the high level definition of the test:

let
  makeTest = import (<nixpkgs> + "/nixos/tests/make-test-python.nix");
in
makeTest (
  { pkgs, lib, ... }:
    {
      nodes = {
        # The `isp` node will represent my upstream ISP. It will run a
        # traditional network stack with isc's dhcpd4 and dhcpd6
        isp = {
          virtualisation.vlans = [ 1 ];
          imports = [ ./isp.nix ];
        };

        # My CPE / router that will handle internal as well as external
        # DHCPv{4,6}, RA etc..
        router = {
          virtualisation.vlans = [ 1 2 ];
          # import the custom module that contains all the implementation details
          imports = [ ./router.nix ];
        };

        # A test client that will verify the result
        client = {
          virtualisation.vlans = [ 2 ];
          imports = [ ./client.nix ];
        };
      };

      testScript = builtins.readFile ./test.py;
    }
)

First the makeTest function is being imported from <nixpkgs>. It serves as a way to construct NixOS VM tests from the package set. Each test contains a number of machine definitions and a test script.

Each of the nodes can be in multiple virtual networks. The ISP and the Router node will share one VLAN (1), the Router and the Client node will share another (2). This simulates the missing wires/switches/wireless between the devices.
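The topology can also be written down as data, which makes it easy to see which nodes share a segment. This is only an illustrative sketch: the `interface_names` helper encodes my understanding that the test driver keeps `eth0` for itself and maps the VLANs to `eth1`, `eth2`, … in list order, which matches the interface names used in the configuration files below.

```python
# VLAN membership copied from the test definition above.
vlans = {"isp": [1], "router": [1, 2], "client": [2]}


def shared_links(vlans):
    """Return {vlan_id: sorted node names} for every virtual network."""
    links = {}
    for node, ids in vlans.items():
        for vlan_id in ids:
            links.setdefault(vlan_id, []).append(node)
    return {vlan_id: sorted(nodes) for vlan_id, nodes in links.items()}


def interface_names(vlan_ids):
    """Assumption: eth0 is reserved, VLANs become eth1, eth2, ... in list order."""
    return {f"eth{i}": vlan_id for i, vlan_id in enumerate(vlan_ids, start=1)}


links = shared_links(vlans)

# The isp and the client never share a segment, so any traffic between
# them has to be forwarded by the router.
directly_connected = any({"isp", "client"} <= set(nodes) for nodes in links.values())

print(links)
print(interface_names(vlans["router"]))
print(interface_names(vlans["client"]))
```

Note that the client's only test interface is `eth1`, even though it sits on VLAN 2.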

Each of the nodes imports a single file that contains the detailed configuration.

The next step was filling in the details about how the ISP’s side of things operates. I mostly just copied in some files that I had lying around from a previous setup. Here is the commented isp.nix:

{ lib, pkgs, ... }:
{
  # The ISP router's job is to delegate IPv6 prefixes via DHCPv6. Like with
  # regular IPv6 auto-configuration it will also emit IPv6 router
  # advertisements (RAs). Those RAs will not carry a prefix but instead
  # just set the "Managed" flag to indicate to the receiving nodes that
  # they should attempt DHCPv6.
  #
  # Note: On the ISP's device we don't really care if we are using networkd
  # in this example. That being said we can't use it (yet) as networkd
  # doesn't implement the serving side of DHCPv6. We will use ISC's well
  # aged dhcpd6 for that task.
  networking = {
    useDHCP = false;
    firewall.enable = false;
    interfaces.eth1.ipv4.addresses = lib.mkForce []; # no need for legacy IP
    interfaces.eth1.ipv6.addresses = lib.mkForce [
      { address = "2001:DB8::"; prefixLength = 64; }
    ];
  };

  # Since we want to program the routes that we delegate to the "customer"
  # into our routing table we must have a way to gain the required privs.
  # This security wrapper will do in our test setup.
  #
  # DO NOT COPY THIS TO PRODUCTION AS IS. Think about it at least twice.
  # Everyone on the "isp" machine will be able to add routes to the kernel.
  security.wrappers.add-dhcpd-lease = {
    source = pkgs.writeShellScript "add-dhcpd-lease" ''
      exec ${pkgs.iproute}/bin/ip -6 route replace "$1" via "$2"
    '';
    capabilities = "cap_net_admin+ep";
  };
  services = {
    # Configure the DHCPv6 server
    #
    # We will hand out /48 prefixes from the subnet 2001:DB8:F000::/36.
    # That gives us 4096 prefixes. That should be enough for this test.
    #
    # Since (usually) you will not receive a prefix with the router
    # advertisements we also hand out /128 leases from the range
    # 2001:DB8:0000:0000:FFFF::/112.
    dhcpd6 = {
      enable = true;
      interfaces = [ "eth1" ];
      extraConfig = ''
        subnet6 2001:DB8::/36 {
          range6 2001:DB8:0000:0000:FFFF:: 2001:DB8:0000:0000:FFFF::FFFF;
          prefix6 2001:DB8:F000:: 2001:DB8:FFFF:: /48;
        }

        # This is the secret sauce. We have to extract the prefix and the
        # next hop when committing the lease to the database.  dhcpd6
        # (rightfully) has no concept of adding routes to the system's
        # routing table. It really depends on the setup.
        #
        # In a production environment your DHCPv6 server is likely not the
        # router. You might want to consider BGP, custom NetConf calls, …
        # in those cases.
        on commit {
          set IP = pick-first-value(binary-to-ascii(16, 16, ":", substring(option dhcp6.ia-na, 16, 16)), "n/a");
          set Prefix = pick-first-value(binary-to-ascii(16, 16, ":", suffix(option dhcp6.ia-pd, 16)), "n/a");
          set PrefixLength = pick-first-value(binary-to-ascii(10, 8, ":", substring(suffix(option dhcp6.ia-pd, 17), 0, 1)), "n/a");
          log(concat(IP, " ", Prefix, " ", PrefixLength));
          execute("/run/wrappers/bin/add-dhcpd-lease", concat(Prefix,"/",PrefixLength), IP);
        }
      '';
    };

    # Finally we have to set up the router advertisements. While we could be
    # using networkd or bird for this task `radvd` is probably the most
    # venerable of them all. It was made explicitly for this purpose and
    # the configuration is much more straightforward than what networkd
    # requires.
    # As outlined above we will have to set the `Managed` flag as otherwise
    # the clients will not know if they should do DHCPv6. (Some do
    # anyway/always)
    radvd = {
      enable = true;
      config = ''
        interface eth1 {
          AdvSendAdvert on;
          AdvManagedFlag on;
          AdvOtherConfigFlag off; # we don't really have DNS or NTP or anything like that to distribute
          prefix ::/64 {
            AdvOnLink on;
            AdvAutonomous on;
          };
        };
      '';
    };
  };
}
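The `on commit` expressions above are plain byte slicing on the raw DHCPv6 option data. The following sketch rebuilds minimal IA_NA and IA_PD option payloads following the RFC 8415 layout and applies the same offsets in Python. The helper names, lifetimes and example addresses are made up for illustration, and the payloads assume a single address or prefix without any nested sub-options, which is what the offsets in the hook rely on as well. `to_groups` mimics my understanding of `binary-to-ascii(16, 16, ":", …)`: 16-bit hexadecimal groups without zero compression.

```python
import ipaddress
import struct

OPTION_IAADDR = 5     # RFC 8415 option code for an IA Address
OPTION_IAPREFIX = 26  # RFC 8415 option code for an IA Prefix


def build_ia_na(address):
    """IA_NA data: IAID, T1, T2 (12 bytes), then one IA Address option."""
    packed = ipaddress.IPv6Address(address).packed
    ia_addr = struct.pack("!HH", OPTION_IAADDR, 24) + packed + struct.pack("!II", 3600, 7200)
    return struct.pack("!III", 1, 1800, 2880) + ia_addr


def build_ia_pd(prefix):
    """IA_PD data: IAID, T1, T2 (12 bytes), then one IA Prefix option."""
    net = ipaddress.IPv6Network(prefix)
    header = struct.pack("!HHIIB", OPTION_IAPREFIX, 25, 3600, 7200, net.prefixlen)
    return struct.pack("!III", 1, 1800, 2880) + header + net.network_address.packed


def to_groups(raw):
    """Format bytes as colon separated 16-bit hex groups, no zero compression."""
    return ":".join(format(int.from_bytes(raw[i:i + 2], "big"), "x") for i in range(0, len(raw), 2))


def extract(ia_na, ia_pd):
    ip = to_groups(ia_na[16:32])     # substring(option dhcp6.ia-na, 16, 16)
    prefix = to_groups(ia_pd[-16:])  # suffix(option dhcp6.ia-pd, 16)
    prefix_length = str(ia_pd[-17])  # first byte of suffix(option dhcp6.ia-pd, 17)
    return ip, prefix, prefix_length


ip, prefix, prefix_length = extract(
    build_ia_na("2001:db8:0:0:ffff::1234"),
    build_ia_pd("2001:db8:f000::/48"),
)
# Arguments the wrapper would receive: "$1" is the prefix, "$2" the next hop.
route_args = [f"{prefix}/{prefix_length}", ip]
print(route_args)
```

The uncompressed notation looks unusual but is still a valid IPv6 literal, so `ip -6 route replace` accepts it.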

Next the router.nix:

# This will be our (residential) router that receives the IPv6 prefix (IA_PD)
# and /128 (IA_NA) allocation.
#
# Here we will actually start using networkd.
{
  systemd.services.systemd-networkd.environment.SYSTEMD_LOG_LEVEL = "debug";

  boot.kernel.sysctl = {
    # we want to forward packets from the ISP to the client and back.
    "net.ipv6.conf.all.forwarding" = 1;
  };

  networking = {
    useNetworkd = true;
    useDHCP = false;
    # Consider enabling this in production and generating firewall rules
    # for forwarding/input from the configured interfaces so you do not have
    # to manage multiple places
    firewall.enable = false;
  };

  systemd.network = {
    networks = {
      # Configuration of the interface to the ISP.
      # We must accept RAs and request the PD prefix.
      "01-eth1" = {
        name = "eth1";
        networkConfig = {
          Description = "ISP interface";
          IPv6AcceptRA = true;
          #DHCP = false; # no need for legacy IP
        };
        linkConfig = {
          # We care about this interface when talking about being "online".
          # If this interface is in the `routable` state we can reach
          # others and they should be able to reach us.
          RequiredForOnline = "routable";
        };
        # This configures the DHCPv6 client part towards the ISP's DHCPv6 server.
        dhcpV6Config = {
          # We have to include a request for a prefix in our DHCPv6 client
          # request packets.
          # Otherwise the upstream DHCPv6 server wouldn't know if we want a
          # prefix or not.  Note: On some installations it makes sense to
          # always force that option on the DHCPv6 server since there are
          # certain CPEs that are just not setting this field but happily
          # accept the delegated prefix.
          PrefixDelegationHint  = "::/48";
        };
        ipv6PrefixDelegationConfig = {
          # Let networkd know that we would very much like to use DHCPv6
          # to obtain the "managed" information. Not sure why they can't
          # just take that from the upstream RAs.
          Managed = true;
        };
      };

      # Interface to the client. Here we should redistribute a /64 from
      # the prefix we received from the ISP.
      "01-eth2" = {
        name = "eth2";
        networkConfig = {
          Description = "Client interface";
          # the client shouldn't be allowed to send us RAs, that would be weird.
          IPv6AcceptRA = false;

          # Just delegate prefixes from the DHCPv6 PD pool.
          # If you also want to distribute a local ULA prefix you want to
          # set this to `yes` as that includes both static prefixes as well
          # as PD prefixes.
          IPv6PrefixDelegation = "dhcpv6";
        };
        # finally "act as router" (according to systemd.network(5))
        ipv6PrefixDelegationConfig = {
          RouterLifetimeSec = 300; # required as otherwise no RA's are being emitted

          # In a production environment you should consider setting these as well:
          #EmitDNS = true;
          #EmitDomains = true;
          #DNS = "fe80::1"; # or whatever "well known" IP your router will have on the inside.
        };

        # This adds a "random" ULA prefix to the interface that is being
        # advertised to the clients.
        # Not used in this test.
        # ipv6Prefixes = [
        #   {
        #     ipv6PrefixConfig = {
        #       AddressAutoconfiguration = true;
        #       PreferredLifetimeSec = 1800;
        #       ValidLifetimeSec = 1800;
        #     };
        #   }
        # ];
      };

      # finally we are going to add a static IPv6 unique local address to
      # the "lo" interface.  This will serve as ICMPv6 echo target to
      # verify connectivity from the client to the router.
      "01-lo" = {
        name = "lo";
        addresses = [
          { addressConfig.Address = "FD42::1/128"; }
        ];
      };
    };
  };

  # make the network-online target a requirement, we wait for it in our test script
  systemd.targets.network-online.wantedBy = [ "multi-user.target" ];
}
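The numbers behind the delegation are easy to check with Python's `ipaddress` module. The sketch below counts the /48 prefixes in the ISP's `prefix6` range and shows how /64 subnets can be carved out of one delegated /48. The concrete lease and the `nth_subnet` helper are hypothetical; which subnet ID networkd actually picks for `eth2` is not something this sketch tries to predict.

```python
import ipaddress

# Pool boundaries as configured in the prefix6 statement of isp.nix.
pool_start = ipaddress.IPv6Address("2001:db8:f000::")
pool_end = ipaddress.IPv6Address("2001:db8:ffff::")

# One /48 covers 2**(128 - 48) addresses; the range is inclusive.
delegation_size = 2 ** (128 - 48)
pool_prefixes = (int(pool_end) - int(pool_start)) // delegation_size + 1

# Hypothetical lease handed to the router.
delegated = ipaddress.IPv6Network("2001:db8:f000::/48")

# A /48 leaves 64 - 48 = 16 bits of subnet ID for /64 networks.
subnets_per_delegation = 2 ** (64 - delegated.prefixlen)


def nth_subnet(delegated, subnet_id):
    """Return the /64 with the given subnet ID inside the delegated prefix."""
    if not 0 <= subnet_id < 2 ** (64 - delegated.prefixlen):
        raise ValueError("subnet ID does not fit into the delegated prefix")
    base = int(delegated.network_address)
    return ipaddress.IPv6Network((base + (subnet_id << 64), 64))


print(pool_prefixes, subnets_per_delegation)
print(nth_subnet(delegated, 0), nth_subnet(delegated, 1))
```

Any of those /64 networks is a valid candidate for the router advertisements on the client-facing interface.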

And finally the simplest of them all, the client.nix:

# This is the client behind the router. We should be receiving router
# advertisements for both the ULA and the delegated prefix.
# All we have to do is boot with the default (networkd) configuration.
{
  systemd.services.systemd-networkd.environment.SYSTEMD_LOG_LEVEL = "debug";
  networking = {
    useNetworkd = true;
    useDHCP = false;
  };

  # make the network-online target a requirement, we wait for it in our test script
  systemd.targets.network-online.wantedBy = [ "multi-user.target" ];
}

While this looks pretty verbose the actual information density is pretty high once you remove all the comments.

Now that all the system configurations have been laid out we can start thinking about testing the setup.

The key results we want to gather from executing this test are:

  1. Does our router configure all the interfaces as we expect?
  2. Do we receive router advertisements on the client?
  3. Does the router request a prefix and advertise a subnet from that to the client?
  4. Is forwarding from / to the delegated prefix working?

We can do that fairly easily by filling in the test script that the testScript attribute points to.

As a first step we start the router node and ensure that systemd-networkd had a chance to configure the interfaces and is ready to send out router advertisements.

# First start the router and wait for it to reach a state where we are
# certain networkd is up and it is able to send out RAs
router.start()
router.wait_for_unit("systemd-networkd.service")

Afterwards we can start the client and wait for its network stack to reach the online state by waiting for network-online.target. That means the system has brought up all the configured interfaces and whatever interface was configured as uplink has reached the desired state (routable):

client.start()
client.wait_for_unit("network-online.target")

Since we previously configured a static address on the router node (FD42::1) we can verify that we got a valid default route via the router by attempting to reach said address.

# the static address on the router should become reachable within a few seconds
client.wait_until_succeeds("ping -6 -c 1 FD42::1")

We use the wait_until_succeeds function since it might take a few seconds for the RAs to arrive, for everything to be set up, etc.
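The idea behind such a helper is a simple retry loop with a deadline. The sketch below is not the test driver's implementation; `wait_until` and the fake check are hypothetical names that only illustrate the polling pattern, with the fake check standing in for a ping that starts working on the third attempt.

```python
import time


def wait_until(check, timeout=10.0, interval=0.1):
    """Call check() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met in time")
        time.sleep(interval)


attempts = {"count": 0}


def ping_succeeds_on_third_try():
    # Stand-in for a command whose result changes once the RA has been processed.
    attempts["count"] += 1
    return attempts["count"] >= 3


result = wait_until(ping_succeeds_on_third_try, timeout=5.0, interval=0.01)
print(result, attempts["count"])
```

Polling like this keeps the test independent of the exact RA timing, at the cost of a slower failure when something really is broken.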

Once this succeeds we know that we have working L3 connectivity between the router and the client. That likely means that our objective 2 is fulfilled; 1 is likely fine as well, but we will continue being skeptical.

At this point the upstream router is still powered off and we shouldn’t be able to reach any of the addresses configured on there. Just to make sure we didn’t screw up elsewhere in the setup we can verify that:

# the global IP of the ISP router should not be reachable yet
router.fail("ping -6 -c 1 2001:DB8::")

Now that we are somewhat certain that the router and client setup is kinda working we can start the isp node and finally start checking the prefix delegation.

isp.start()
# Since for the ISP "being online" should have no real meaning we just
# wait for the target where all the units have been started.
# It probably still takes a few more seconds for all the RA timers to be
# fired etc..
isp.wait_for_unit("multi-user.target")

Since the router considers its uplink interface to be required for being online (as without uplink there is no external connectivity) we can now just wait for our router to reach the online state. Once that is reached we know that the ISP did hand out an IPv6 address and potentially an IA_PD prefix:

# wait until the uplink interface has a good status
router.wait_for_unit("network-online.target")
router.wait_until_succeeds("ping -6 -c1 2001:DB8::")

This now tells us that the router node has obtained a default route and some address that is routable between it and the isp node. Shortly after that the client should obtain a publicly routable address from the delegated prefix. Yet again we give it a few seconds to propagate and thus we just use wait_until_succeeds:

client.wait_until_succeeds("ping -6 -c1 2001:DB8::")

Now that we know that we have connectivity all the way client <-> router <-> isp we do a final check on the prefix we got assigned on the client. We expect an IPv6 address that has global scope and lies within the documentation prefix (2001:DB8::/32):

# verify that we got a globally scoped address in eth1 from the
# documentation prefix
ip_output = client.succeed("ip --json -6 address show dev eth1")

import json

ip_json = json.loads(ip_output)[0]
assert any(
    addr["local"].upper().startswith("2001:DB8:")
    for addr in ip_json["addr_info"]
    if addr["scope"] == "global"
)
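The same check can be written as a small function that can be tried without booting any VM. This is a sketch under assumptions: `has_global_address_in` is a made-up helper, and the sample document is hand-written in the shape of `ip --json address show` output, not captured from a real run. Instead of comparing strings it uses the `ipaddress` module, so the result does not depend on upper or lower case or on zero compression.

```python
import ipaddress
import json

DOCUMENTATION_PREFIX = ipaddress.ip_network("2001:db8::/32")


def has_global_address_in(ip_json_text, network=DOCUMENTATION_PREFIX):
    """Return True if any interface has a global-scope address inside network."""
    for interface in json.loads(ip_json_text):
        for addr in interface.get("addr_info", []):
            if addr.get("scope") != "global":
                continue
            if ipaddress.ip_address(addr["local"]) in network:
                return True
    return False


# Hand-written sample: one link-local and one global address on eth1.
sample = json.dumps([
    {
        "ifname": "eth1",
        "addr_info": [
            {"family": "inet6", "local": "fe80::5054:ff:fe12:102", "prefixlen": 64, "scope": "link"},
            {"family": "inet6", "local": "2001:DB8:F000:0:5054:ff:fe12:102", "prefixlen": 64, "scope": "global"},
        ],
    }
])

# A client that only has its link-local address so far.
link_local_only = json.dumps([
    {"ifname": "eth1", "addr_info": [
        {"family": "inet6", "local": "fe80::5054:ff:fe12:102", "prefixlen": 64, "scope": "link"},
    ]}
])

print(has_global_address_in(sample), has_global_address_in(link_local_only))
```

In the test script the function would be fed with the output of `client.succeed("ip --json -6 address show dev eth1")`.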

The code from the examples can be retrieved here


This example is based on the NixOS test that I contributed. For my actual deployment I am using a custom router module that is being tested as part of my development work on the networkd DHCPv6 subnet id PR.


  1. These days there are (as far as I know) helpers that try to move these semantics over to the traditional wrappers. I’ve not seen any of them in years though.