Managing DNS Servers with Ansible and Jenkins (Unbound, BIND)

DNS is a vital component of all computer networks. Often called the "Internet Yellow Pages," it is a service that every connected household relies on.

DNS services are typically deployed in several patterns to support users and systems:

  • DNS Forwarder: This deployment method is the most common. Everybody needs name resolution - caching and forwarding DNS results can save you bandwidth and improve localized performance. Most appliances can do this out of the box, and if they don't, try it out! It's really easy and will help you learn how DNS works.
    • Use case: You don't have your own domain and use computers.
  • Managed Public DNS: A significant majority of public domains are managed this way. You pay a third-party provider to host the authoritative public DNS records for your domain.
    • Use case: You have a business and own a domain, but don't have any internal resources that you need to resolve.
    • Use case: You have a business and own a domain, but don't want to manage publicly resolvable nameservers
  • Private/Internal Nameserver: This deployment method is typically enterprise-specific, but is also required for home labs and all manner of weird experiments. Since it's not on the internet, we can violate any and all manner of Internet conventions.
    • The first component here is a recursive nameserver, because even if you run a server for authoritative zones, you still need a second server for recursive lookups.
    • Authoritative zones: For any given domain, keep a zone file to resolve against. This will include name-to-address (forward) records and address-to-name (reverse) records, kept in separate zone files.
  • A method to change everything above. This has a high benefit:effort ratio.
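
The forward and reverse lookups mentioned under authoritative zones can be illustrated with a short Python sketch. The hostnames and addresses are borrowed from the example zone later in this post, the helper name `reverse_name` is hypothetical, and `reverse_pointer` comes from Python's standard `ipaddress` module.

```python
import ipaddress


def reverse_name(address: str) -> str:
    """Return the fully qualified PTR owner name for an IP address."""
    # reverse_pointer yields e.g. "1.1.1.10.in-addr.arpa" (no trailing dot).
    return ipaddress.ip_address(address).reverse_pointer + "."


# Forward (name-to-address) records, as they would appear in the forward zone.
forward = {
    "ns.engyak.net.": "10.0.0.1",
    "johnnyfive.engyak.net.": "10.1.1.1",
}

# Derive the reverse (address-to-name) records for the separate reverse zone.
reverse = {reverse_name(addr): name for name, addr in forward.items()}

for owner, target in reverse.items():
    print(f"{owner} IN PTR {target}")
```

Deriving the reverse records from the forward ones, rather than typing both by hand, is one way to keep the two zone files from drifting apart.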

For this post, we'll build the structure to have an internal nameserver managed completely from source control. Getting started is surprisingly easy: performing this work with abstraction is a welcome convenience, but not initially necessary, because zone files are typically very simple and the application (Bind 9 or Unbound) is only one service.

To perform this, we'll follow this procedure:

  • Install the service - in this case, we'll use CentOS for Bind9 (my old setup), and Debian 11 for Unbound (because Debian 11 is new).
  • Extract the configuration file, and then export it into source control.
  • Create zone files, and then export them into source control.
  • Automate delivery from source control to what we'll now call the "DNS Worker Node"

Bind9

dnf install bind
find / -name 'named.conf'
cat /etc/named.conf

Example named configuration file (credit where it's due: the vast majority of this configuration is provided by CentOS and Bind9; I set the forwarders, allow-query, listen-on, and zone directives):

options {
        listen-on { any; };
        listen-on-v6 { any; };
        directory       "/var/named";
        dump-file       "/var/named/data/cache_dump.db";
        statistics-file "/var/named/data/named_stats.txt";
        memstatistics-file "/var/named/data/named_mem_stats.txt";
        secroots-file   "/var/named/data/named.secroots";
        recursing-file  "/var/named/data/named.recursing";
        allow-query { 10.0.0.0/8; 127.0.0.1; 2000::/3; };
        forwarders { 1.1.1.1; 9.9.9.9; };
        /*
         - If you are building an AUTHORITATIVE DNS server, do NOT enable recursion.
         - If you are building a RECURSIVE (caching) DNS server, you need to enable
           recursion.
         - If your recursive DNS server has a public IP address, you MUST enable access
           control to limit queries to your legitimate users. Failing to do so will
           cause your server to become part of large scale DNS amplification
           attacks. Implementing BCP38 within your network would greatly
           reduce such attack surface
        */
        recursion yes;

        dnssec-enable yes;
        dnssec-validation yes;

        managed-keys-directory "/var/named/dynamic";

        pid-file "/run/named/named.pid";
        session-keyfile "/run/named/session.key";

        /* https://fedoraproject.org/wiki/Changes/CryptoPolicy */
        include "/etc/crypto-policies/back-ends/bind.config";

};

zone "engyak.net" in {
        allow-transfer { any; };
        file "/etc/named/engyak.net.zone";
        type master;
};

Then, let's build a zone file in source control. Please note that there are additional conventions that should be followed when creating new DNS zone records; this is just an example file that will run!

$TTL 2d
@               IN SOA          ns.engyak.net. hostmaster.engyak.net. (
                                1               ; serial
                                3600            ; refresh
                                600             ; retry
                                608400          ; expiry
                                3600 )          ; negative-caching TTL
;
;
engyak.net.     IN NS           ns.engyak.net.
ns              IN A            10.0.0.1
johnnyfive      IN A            10.1.1.1
duncanidaho     IN A            10.2.2.2
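
One of those additional conventions is a date-based SOA serial (YYYYMMDDnn, as recommended in RFC 1912) in place of the bare `1` used in the example. Here is a minimal sketch of computing it; the function name `next_serial` and the choice to start each day at revision 00 are assumptions made for illustration.

```python
from datetime import date


def next_serial(current: int, today: date) -> int:
    """Return the next SOA serial using the YYYYMMDDnn convention."""
    base = int(today.strftime("%Y%m%d")) * 100  # first revision of the day
    # Always move forward: bump within the day, or jump to today's base.
    return current + 1 if current >= base else base


print(next_serial(1, date(2021, 9, 1)))           # first dated serial
print(next_serial(2021090100, date(2021, 9, 1)))  # second change that day
```

Because this setup pushes files to every node and restarts the service, the serial probably matters less for propagation than it would with zone transfers, but it still works as a readable change marker.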

Copy the named.conf contents into a new source code repository or your existing one, preferably in an organized fashion. Ansible playbook execution is very straightforward. I'd recommend building this in source control as well; see the note above about potential process improvements.

---
- hosts: ns.engyak.net
  tasks:
    - name: "Update DNS Zones!"
      copy:
        src: zonefiles/engyak.net
        dest: /etc/named/engyak.net.zone
        mode: "0644"
    - name: "Update DNS Config!"
      copy:
        src: conf.d/ns.engyak.net/named.conf
        dest: /etc/named.conf
        mode: "0640"
    - name: "Restart Named!"
      service:
        name: "named"
        state: "restarted"

Any time you run this playbook, it will copy a fresh configuration and zone file to the server, then restart Bind9.

As a cherry on top, let's make this process smart. If we want to automatically deploy changes to DNS from source control, we need a CI tool like Jenkins. Start off by creating a new Freestyle project to "Watch SCM" (yes, this isn't a real repository).

[Screenshots of the Jenkins job configuration: Source Code Management, Build Triggers, Build Commands]

That's it - add entries, live long, and prosper! Since the Ansible playbook and supporting files are fetched via source control, the only setup required on a DNS worker node is to establish a relationship between it and the CI tool, e.g. SSH authentication.

Unbound

Unbound is a newer DNS server project and has quite a few interesting properties. I've been using BIND for well over a decade, and Unbound aims to change a few things.

Oddly enough, there is no features list for this software package, but pretty much everything else is impressively documented. Let's start the installation:

apt install unbound
cat /usr/share/doc/unbound/examples/unbound.conf

Unbound can use the same zonefile format as BIND, so we only need to create a new config file to migrate things over. Note: This is not a production-ready configuration; it's just enough to get me started.

As I learn more about Unbound, I'll be using source control to roll out changes and roll them back - an important benefit when making lots of mistakes!

# The server clause sets the main parameters.
server:
        verbosity: 1
        num-threads: 2
        interface: 0.0.0.0
        interface: ::0
        port: 53
        prefer-ip4: no
        edns-buffer-size: 1232

        # Maximum UDP response size (not applied to TCP response).
        # Suggested values are 512 to 4096. Default is 4096. 65536 disables it.
        max-udp-size: 4096
        msg-buffer-size: 65552
        udp-connect: yes
        unknown-server-time-limit: 376

        do-ip4: yes
        do-ip6: yes
        do-udp: yes
        do-tcp: yes

        # control which clients are allowed to make (recursive) queries
        # to this server. Specify classless netblocks with /size and action.
        # By default everything is refused, except for localhost.
        access-control: 10.0.0.0/8 allow
        access-control: 127.0.0.0/8 allow

        private-domain: "engyak.net"
        caps-exempt: "engyak.net"
        domain-insecure: "engyak.net"

        private-address: 10.0.0.0/8

        # cipher setting for TLSv1.2
        tls-ciphers: "ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256"
        # cipher setting for TLSv1.3
        tls-ciphersuites: "TLS_AES_128_GCM_SHA256:TLS_AES_128_CCM_8_SHA256:TLS_AES_128_CCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256"

# Python config section. To enable:
# o use --with-pythonmodule to configure before compiling.
# o list python in the module-config string (above) to enable.
#   It can be at the start, it gets validated results, or just before
#   the iterator and process before DNSSEC validation.
# o and give a python-script to run.
python:
        # Script file to load
        # python-script: "/etc/unbound/ubmodule-tst.py"

# Dynamic library config section. To enable:
# o use --with-dynlibmodule to configure before compiling.
# o list dynlib in the module-config string (above) to enable.
#   It can be placed anywhere, the dynlib module is only a very thin wrapper
#   to load modules dynamically.
# o and give a dynlib-file to run. If more than one dynlib entry is listed in
#   the module-config then you need one dynlib-file per instance.
dynlib:
        # Script file to load
        # dynlib-file: "/etc/unbound/dynlib.so"

# Remote control config section.
remote-control:
        # Enable remote control with unbound-control(8) here.
        # set up the keys and certificates with unbound-control-setup.
        control-enable: no

# Authority zones
# The data for these zones is kept locally, from a file or downloaded.
# The data can be served to downstream clients, or used instead of the
# upstream (which saves a lookup to the upstream).  The first example
# has a copy of the root for local usage.  The second serves example.org
# authoritatively.  zonefile: reads from file (and writes to it if you also
# download it), primary: fetches with AXFR and IXFR, or url to zonefile.
# With allow-notify: you can give additional (apart from primaries) sources of
# notifies.
forward-zone:
      name: "."
      forward-addr: 1.1.1.1
      forward-addr: 9.9.9.9
auth-zone:
      name: "engyak.net"
      for-downstream: yes
      for-upstream: yes
      zonefile: "engyak.net.zone"

To automate file delivery here, we'll use a (similar) playbook for Unbound. The Jenkins configuration will not need to be modified, because the playbook will automatically be re-executed.

---
- hosts: ns.engyak.net
  tasks:
    - name: "Update DNS Zones!"
      copy:
        src: zonefiles/engyak.net
        dest: /etc/unbound/engyak.net.zone
        mode: "0644"
    - name: "Update DNS Config!"
      copy:
        src: conf.d/ns.engyak.net/unbound.conf
        # Debian's unbound package reads its config from /etc/unbound/
        dest: /etc/unbound/unbound.conf
        mode: "0640"
    - name: "Restart Unbound!"
      service:
        name: "unbound"
        state: "restarted"

Some Thoughts

This method of building DNS records from a source of truth does replace the master-slave (sorry guys, BIND's terms are not my own!) relationship older name servers will typically use. Personally, I like this method of propagation.

The biggest upside here is that a DNS worker node being unavailable does not prevent an engineer from adding/modifying records as long as recursive name servers support multiple resolvers.

It is eventually consistent, as the orchestrator will update every worker node for you. This may be slower or faster, depending on TTL.
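
One way to see whether the worker nodes have converged is to compare the SOA serial each one reports. This is a rough sketch that shells out to `dig`; the function names are hypothetical, and it only tells you something if the serial is bumped on every change.

```python
import subprocess


def parse_serial(soa_line: str) -> int:
    """Extract the serial from `dig +short SOA` output.

    The short form is: mname rname serial refresh retry expire minimum.
    """
    return int(soa_line.split()[2])


def query_serial(zone: str, server: str) -> int:
    """Ask one name server for the zone's SOA serial (requires `dig`)."""
    result = subprocess.run(
        ["dig", "+short", "SOA", zone, f"@{server}"],
        capture_output=True, text=True, check=True,
    )
    return parse_serial(result.stdout)


def in_sync(serials: dict) -> bool:
    """True when every worker node reports the same serial."""
    return len(set(serials.values())) <= 1
```

With a list of node addresses of your own, `in_sync({n: query_serial("engyak.net", n) for n in nodes})` would report whether the last push has reached all of them.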

The Ansible playbook I used here will kill your DNS node if you push it into an invalid configuration, so this is probably not production-worthy without additional work.
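
Part of that additional work could be a pre-flight check that runs before anything is restarted. The sketch below wraps the validation tools that ship with each server (`named-checkconf` and `named-checkzone` for BIND, `unbound-checkconf` for Unbound). The function names and the idea of passing staged file paths are assumptions for illustration, not part of the playbook above.

```python
import subprocess


def check_commands(service: str, conf_path: str, zones: dict) -> list:
    """Build the validation commands for a staged config and its zones.

    `zones` maps a zone origin to the staged zone file path.
    """
    if service == "named":
        commands = [["named-checkconf", conf_path]]
        commands += [["named-checkzone", origin, path]
                     for origin, path in zones.items()]
        return commands
    if service == "unbound":
        # unbound-checkconf validates the config file it is given.
        return [["unbound-checkconf", conf_path]]
    raise ValueError(f"unknown service: {service}")


def preflight(service, conf_path, zones, run=subprocess.run) -> bool:
    """Return True only if every check exits with status 0."""
    for command in check_commands(service, conf_path, zones):
        if run(command, capture_output=True, text=True).returncode != 0:
            return False
    return True
```

Ansible's `copy` module also accepts a `validate` option that runs a command against the staged file before moving it into place, which may be the simpler route if you stay inside the playbook.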

If you would rather purchase a platform instead of building this capability with F/OSS components, this is basically how Infoblox Grid works.

It'd be really neat to abstract software-specific constructs, which can be done with Python and Jinja2 (or just Ansible and Jinja2!)
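
As a rough idea of what that abstraction might look like, the sketch below renders the same list of forwarders into both BIND and Unbound syntax. Plain Python string formatting stands in for Jinja2 templates here, and the function names and the `site` dictionary are hypothetical.

```python
def bind_forwarders(addresses):
    """Render a BIND `forwarders` statement."""
    return "forwarders { " + " ".join(f"{a};" for a in addresses) + " };"


def unbound_forwarders(addresses):
    """Render an Unbound `forward-zone` clause for the root."""
    lines = ["forward-zone:", '      name: "."']
    lines += [f"      forward-addr: {a}" for a in addresses]
    return "\n".join(lines)


# One software-neutral source of truth (values taken from this post).
site = {"forwarders": ["1.1.1.1", "9.9.9.9"]}

print(bind_forwarders(site["forwarders"]))
print(unbound_forwarders(site["forwarders"]))
```

The same pattern would extend to zones and access lists, so that switching DNS software means swapping the renderer, not rewriting the data.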