Saturday, October 15, 2022

Gathering and Using Data from Cisco NX-OS with Ansible Modules

Reliably executing repetitive tasks with automation is easy (after the work is done).

Given enough work, self-built automation can be easy to consume. The engineers who build it need to focus on reliability and repeatability, but occasionally there's an opportunity to save time and simplify lives directly.

Information gathering with Ansible is powerful: the effort to run a check on one network node is roughly equal to the effort for two, or even one hundred. Here's a quick and easy way to get started.

Ansible Inventory

Ansible likes to know where each managed node lives, and provides the inventory capability to organize similar devices for remote management. Not all network automation endpoints use the inventory feature, so ensure that you read the published documentation first. 

Note: The easiest way to check inventory dependency is to verify if there are directives in the playbook named hostname, username, or password. If they exist, that module probably does not use inventory.

Ansible supports two common formats for an on-controller inventory: INI (conf-like) and YAML. Here's an example in YAML, which I personally find easier to read:

---
  nxos_example_001:
    hosts:
      nexus_1:
        ansible_host: "1.1.1.1"
      nexus_2:
        ansible_host: "2.2.2.2"
    vars:
      ansible_user: "admin"
  nxos_all:
    children:
      nxos_example_001:

We have a little bit to unpack here:

  • The first hierarchical tier is for groups, which can contain other groups if you use the children: directive (see nxos_all as an example)
  • vars: specifies variables shared by all members of that group, so it sits at the same level as hosts:
  • ansible_host specifies the address to connect to - useful in dual stack environments (or ones that don't have DNS)
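
As a quick check of how an inventory like the one above resolves, here's a minimal sketch that prints each node's address and group membership without connecting to any device. The playbook name (show_inventory.yml) and inventory file name (inventory.yml) are assumptions for illustration:

```yaml
---
# show_inventory.yml (hypothetical file name)
- hosts: nxos_all
  gather_facts: false   # no connection is opened; debug runs on the controller
  tasks:
    - name: "Show how the inventory resolved for this node"
      ansible.builtin.debug:
        # Falls back to the inventory name if ansible_host was not set
        msg: "{{ inventory_hostname }} -> {{ ansible_host | default(inventory_hostname) }} (groups: {{ group_names | join(', ') }})"
```

It could be run with ansible-playbook -i inventory.yml show_inventory.yml. Because nxos_all only reaches its hosts through children:, seeing both nodes in the output is a reasonable sign that the group hierarchy is laid out as intended.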

Ansible Facts

Ansible stores the data it gathers about each managed node, along with other runtime variables for a given playbook, as facts. These are held as a Python dict at runtime by Ansible Engine, and the debug: module allows an engineer to print them to stdout:

---
- hosts: localhost
  connection: local
  tasks:
    - name: "Print it!"
      debug:
        var: lookup('ansible.builtin.env', 'PATH')
    - name: "Print it, but with msg!"
      debug:
        msg:
          - "The system environment PATH is: {{ lookup('ansible.builtin.env', 'PATH') }}"
          - "Wise engineers don't use this feature to print passwords"

Running this playbook will produce the following:

ansible-playbook debug.yml 
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'

PLAY [localhost] *************************************************************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *******************************************************************************************************************************************************************************************************************************************************************
ok: [localhost]

TASK [Print it!] *************************************************************************************************************************************************************************************************************************************************************************
ok: [localhost] => {
    "lookup('ansible.builtin.env', 'PATH')": "/root/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/bin/remote-cli:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
}

TASK [Print it, but with msg!] ***********************************************************************************************************************************************************************************************************************************************************
ok: [localhost] => {
    "msg": [
        "The system environment PATH is: /root/.vscode-server/bin/d045a5eda657f4d7b676dedbfa7aab8207f8a075/bin/remote-cli:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
        "Wise engineers don't use this feature to print passwords"
    ]
}

PLAY RECAP *******************************************************************************************************************************************************************************************************************************************************************************
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0 

msg: is effective for formatted output, while var: is considerably simpler when dumping a large dictionary. var: does not require Jinja formatting, which can keep playbooks simpler.
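
To make the contrast concrete, here's a small sketch against the controller's own gathered facts. It assumes default fact gathering is left on (as in the run above), and that the date_time and hostname facts are present, which is typical for a Linux controller:

```yaml
---
- hosts: localhost
  connection: local
  tasks:
    - name: "var: dump an entire dictionary, no Jinja braces needed"
      ansible.builtin.debug:
        var: ansible_facts['date_time']
    - name: "msg: format a single value into a sentence"
      ansible.builtin.debug:
        msg: "This controller is named {{ ansible_facts['hostname'] }}"
```

The first task prints the whole nested structure as-is; the second picks one key and wraps it in text, which is where the Jinja braces become necessary.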

Let's apply this to a Cisco NX-OS node. We can register the output of the nxos_facts module.

Note: The example provided below is the "new way", where Network modules follow the Ansible rules. If using older versions of Ansible (Ansible 2), the following may not be fully available!

First, we need to update the Ansible inventory. We will be using the API method to collect data, and it requires multiple new variables:

  • ansible_network_os: Instructs Ansible on which network platform (and matching plugins) to use for that system
  • ansible_connection: Instructs Ansible on what transport to use (HTTP API, SSH)
  • ansible_httpapi_use_ssl: Instructs Ansible to use HTTPS
  • ansible_httpapi_password: The password used to log in to the HTTP API (left blank here)
  • ansible_httpapi_validate_certs: Controls TLS certificate verification; 'no' is only appropriate for a lab with self-signed certificates

---
  nxos_example_001:
    hosts:
      nexus_1:
        ansible_host: "1.1.1.1"
      nexus_2:
        ansible_host: "2.2.2.2"
    vars:
      ansible_user: "admin"
      ansible_network_os: 'cisco.nxos.nxos'
      ansible_connection: ansible.netcommon.httpapi
      ansible_httpapi_password: ''
      ansible_httpapi_use_ssl: 'yes'
      ansible_httpapi_validate_certs: 'no'
  nxos_all:
    children:
      nxos_example_001:
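
The password is left blank in the inventory above. One way to avoid writing it into the file at all is to resolve it at runtime with the same env lookup used earlier. This is only a sketch: the environment variable name NXOS_PASSWORD is an assumption, and it would need to be exported in the shell that runs ansible-playbook:

```yaml
---
  nxos_example_001:
    vars:
      # Assumption: NXOS_PASSWORD is a hypothetical variable set on the controller
      ansible_httpapi_password: "{{ lookup('ansible.builtin.env', 'NXOS_PASSWORD') }}"
```

Inventory variables are templated when they are used, so the lookup is evaluated on the controller at connection time rather than stored in the inventory.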

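The updated inventory allows us to run extremely simple playbooks to gather data. The run below targets a lab group named nxos_machines; substitute a group from your own inventory, such as nxos_all: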

---
- hosts: nxos_machines
  tasks:
    - name: "Gather facts via NXAPI"
      cisco.nxos.nxos_facts:
        gather_subset: 'min'
        gather_network_resources:
          - 'interfaces'
      register: nxos_facts_gathered
    - name: "Print it!"
      debug:
        var: nxos_facts_gathered

Running this playbook will produce the following:

ansible-playbook debug_nxos_facts.yml

PLAY [nxos_machines] *************************************************************************************************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] ***********************************************************************************************************************************************************************************************************************************************************************************************
[WARNING]: Ignoring timeout(10) for cisco.nxos.nxos_facts
ok: [nx-1]

TASK [Gather facts via NXAPI] ****************************************************************************************************************************************************************************************************************************************************************************************
ok: [nx-1]

TASK [Print it!] *****************************************************************************************************************************************************************************************************************************************************************************************************
ok: [nx-1] => {
    "nxos_facts_gathered": {
        "ansible_facts": {
            "ansible_net_api": "nxapi",
            "ansible_net_gather_network_resources": [
                "interfaces"
            ],
            "ansible_net_gather_subset": [
                "default"
            ],
            "ansible_net_hostname": "AnsLabN9k-1",
            "ansible_net_image": "bootflash:///nxos.9.3.8.bin",
            "ansible_net_license_hostid": "",
            "ansible_net_model": "Nexus9000 C9300v",
            "ansible_net_platform": "N9K-C9300v",
            "ansible_net_python_version": "3.9.2",
            "ansible_net_serialnum": "",
            "ansible_net_system": "nxos",
            "ansible_net_version": "9.3(8)",
            "ansible_network_resources": {
                "interfaces": [
                    {
                        "name": "Ethernet1/1"
                    },
                    {
                        "name": "mgmt0"
                    }
                ]
            }
        },
        "changed": false,
        "failed": false
    }
}

PLAY RECAP ***********************************************************************************************************************************************************************************************************************************************************************************************************
nx-1                       : ok=3    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
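
Once the facts are registered, individual keys can be pulled out rather than dumping the whole structure. Here's a minimal sketch that reuses the key names visible in the output above; the short alias f is just a convenience introduced for this example:

```yaml
---
- hosts: nxos_machines
  tasks:
    - name: "Gather facts via NXAPI"
      cisco.nxos.nxos_facts:
        gather_subset: 'min'
        gather_network_resources:
          - 'interfaces'
      register: nxos_facts_gathered
    - name: "Summarize each node in one line"
      ansible.builtin.debug:
        msg: "{{ f.ansible_net_hostname }} runs NX-OS {{ f.ansible_net_version }} with {{ f.ansible_network_resources.interfaces | length }} interfaces"
      vars:
        # Hypothetical alias to keep the message readable
        f: "{{ nxos_facts_gathered.ansible_facts }}"
```

This is the msg: style from earlier applied to network facts, and it produces one summary line per node in the group.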

Ansible's inventory feature enables us to scale per node without any additional code: the previous playbook will execute once on every host in the targeted group. Keeping lab nodes in their own group also allows an engineer to thoroughly test a playbook on lab resources with some level of separation from production.

Deliberate automation design will bear fruit here, as safety is key when developing and testing automation. As with previous automation-centric posts, thorough, comprehensive testing of automation for reliability is a social responsibility when creating tools.

Establishing a separate CI/CD tooling set to target a lab (or CML, as in this case!) enables us to add additional safeguards against accidental changes, such as ACLs/Firewall policies preventing access from Test CI/CD -> Production network assets. Tools like CML take it even further by allowing an engineer to spin up amnesic NOS instances to run code against.

Here's an applicable instance. Recently, Cisco disclosed a vulnerability with Cisco Fabric Services - and most environments don't need that service running. Disabling it is an aggressive fix - but with Ansible we can check for the service, disable it only if it's running, and then check again afterwards. This illustrates the value of idempotency: a playbook can be run repeatedly and safely because it only makes a change when the node is not already in the desired state.
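
The check, change, re-check flow described above might be sketched as follows. Treat it as an outline under stated assumptions rather than a tested remediation: the CLI strings (show cfs status, no cfs distribute) and the "Distribution : Enabled" output format are assumptions that should be verified against the platform documentation and the Cisco advisory, and the playbook should be run against a lab first:

```yaml
---
- hosts: nxos_all
  gather_facts: false
  vars:
    # Assumption: "show cfs status" returns text with a line like "Distribution : Enabled"
    cfs_enabled_pattern: 'Distribution *: *Enabled'
  tasks:
    - name: "Check whether CFS distribution is enabled"
      cisco.nxos.nxos_command:
        commands:
          - 'show cfs status'
      register: cfs_before
    - name: "Disable CFS distribution only if it is running"
      cisco.nxos.nxos_config:
        lines:
          - 'no cfs distribute'   # assumption: verify this command for your platform
      when: cfs_before.stdout[0] is search(cfs_enabled_pattern)
    - name: "Check again afterwards"
      cisco.nxos.nxos_command:
        commands:
          - 'show cfs status'
      register: cfs_after
    - name: "Fail loudly if CFS distribution is still enabled"
      ansible.builtin.assert:
        that:
          - cfs_after.stdout[0] is not search(cfs_enabled_pattern)
```

On a node where the service is already off, the when: condition skips the configuration task, so a second run reports no changes.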
