VM Deployment Pipelines with Proxmox
Decoupled approaches to deploying IaaS workloads are the way of the future.
Here, we'll construct a VM deployment pipeline leveraging GitHub Actions and Ansible's community modules.
Proxmox Setup
- Not featured here: loading a VM ISO is particular to each Proxmox deployment, but it's a prerequisite for the steps that follow.
Let's create a VM named `deb12.6-template`.
I set a separate VM ID range for templates so they sort together automatically in the UI.
Note: Paravirtualized hardware is still the optimal choice, as with vSphere - but in this case, VirtIO supplies the drivers.
Note: SSD emulation and the QEMU guest agent are required for virtual disk reclamation with QEMU. This is particularly important in my lab.
In this installation, I'm using paravirtualized network adapters and have separated my management (vmbr0) and data (vmbr1) planes.
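For reference, here's a rough `qm` CLI sketch of a template VM along those lines - the VM ID, disk size, storage name, and ISO path are assumptions, not values from my lab:

```bash
# Hypothetical values: VMID 9000 (template ID range), storage "local-lvm",
# ISO already uploaded to the "local" store.
# ssd=1 + discard=on enable the disk reclamation noted above;
# agent enabled=1 turns on the QEMU guest agent integration.
qm create 9000 \
  --name deb12.6-template \
  --cores 1 --memory 2048 \
  --scsihw virtio-scsi-single \
  --scsi0 local-lvm:32,ssd=1,discard=on \
  --net0 virtio,bridge=vmbr0 \
  --net1 virtio,bridge=vmbr1 \
  --agent enabled=1 \
  --ide2 local:iso/debian-12.6.0-amd64-netinst.iso,media=cdrom \
  --boot order='scsi0;ide2'
```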
Debian Linux Setup
I'll skip the Linux installer parts for brevity; Debian's installer is excellent and easy to use.
At a high level, we'll want to do some preparatory steps before declaring this a usable base image:
- Create users
  - Recommended approach: Create a bootstrap user, then shred it
    - Leave the `bootstrap` user with an SSH key on the base image
    - After creation, build a `takeover` playbook that installs the latest and greatest username table, `sssd`, SSH keys, APM, anything with confidential cryptographic material that should not be left unencrypted on the hypervisor (a minimal sketch of such a playbook follows this item)
    - This won't slow the VM deployment speed by as much as you think
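Here's a minimal sketch of that takeover idea - the inventory group, user names, and key path are hypothetical placeholders, not from my environment:

```yaml
---
# Hypothetical sketch of a "takeover" playbook. Assumes the group
# "new_vms", admin user "svc_admin", and key path; sudo/become
# configuration details are elided.
- name: "Install the permanent credentials via the bootstrap user"
  hosts: new_vms
  remote_user: bootstrap
  become: true
  tasks:
    - name: "Create the permanent admin user"
      ansible.builtin.user:
        name: svc_admin
        groups: sudo
        append: true
        shell: /bin/bash
    - name: "Install the real SSH public key"
      ansible.posix.authorized_key:
        user: svc_admin
        key: "{{ lookup('file', '~/.ssh/svc_admin.pub') }}"

- name: "Shred the bootstrap user, reconnecting as the new account"
  hosts: new_vms
  remote_user: svc_admin
  become: true
  tasks:
    - name: "Remove the bootstrap user and its home directory"
      ansible.builtin.user:
        name: bootstrap
        state: absent
        remove: true
```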
- Install packages
  - This is just a list of some basics that I prefer to add to each machine. It's more network-centric; anything more comprehensive should be part of a build playbook specific to whatever's being deployed.
  - Note: This is an Ansible playbook, and therefore it needs Ansible to run (`apt install ansible`)
```yaml
---
- name: "Debian machine prep"
  hosts: localhost
  tasks:
    - name: "Install standard packages"
      ansible.builtin.apt:
        pkg:
          - 'curl'
          - 'dnsutils'
          - 'diffutils'
          - 'ethtool'
          - 'git'
          - 'mtr'
          - 'net-tools'
          - 'netcat-traditional'
          - 'python3-requests'
          - 'python3-jinja2'
          - 'tcpdump'
          - 'telnet'
          - 'traceroute'
          - 'qemu-guest-agent'
          - 'vim'
          - 'wget'
```
- Clean up the disk. This makes the base image more compact - each clone inherits any wasted space, so consider it a 10-20x savings in disk usage. I leave this as a file on the base image and name it `reset_vm.sh`:
```bash
#!/bin/bash

# Clean Apt
apt clean

# Cleaning logs.
if [ -f /var/log/audit/audit.log ]; then
  cat /dev/null > /var/log/audit/audit.log
fi
if [ -f /var/log/wtmp ]; then
  cat /dev/null > /var/log/wtmp
fi
if [ -f /var/log/lastlog ]; then
  cat /dev/null > /var/log/lastlog
fi

# Cleaning udev rules.
if [ -f /etc/udev/rules.d/70-persistent-net.rules ]; then
  rm /etc/udev/rules.d/70-persistent-net.rules
fi

# Cleaning the /tmp directories
rm -rf /tmp/*
rm -rf /var/tmp/*

# Cleaning the SSH host keys
rm -f /etc/ssh/ssh_host_*

# Cleaning the machine-id
truncate -s 0 /etc/machine-id
rm /var/lib/dbus/machine-id
ln -s /etc/machine-id /var/lib/dbus/machine-id

# Cleaning the shell history
unset HISTFILE
history -cw
echo > ~/.bash_history
rm -fr /root/.bash_history

# Truncating hostname, hosts, resolv.conf and setting hostname to localhost
truncate -s 0 /etc/{hostname,hosts,resolv.conf}
hostnamectl set-hostname localhost

# Clean cloud-init - deprecated because cloud-init isn't currently used
# cloud-init clean -s -l

# Force a filesystem sync
sync
```
Shut down the virtual machine. I prefer to start it back up and shut it down again from the hypervisor to ensure that `qemu-guest-agent` is working properly.
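From the hypervisor, that final cycle - plus the template conversion that the clone step later relies on - looks roughly like this, assuming the base image landed on the hypothetical VM ID 9000:

```bash
qm start 9000      # boot once to confirm the guest agent responds
qm shutdown 9000   # clean shutdown issued from the hypervisor
qm template 9000   # convert the VM to a template so it can be cloned
```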
Deployment Pipeline
First, we'll want to create an API token under "Datacenter -> Permissions -> API Tokens".
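The same token can also be created from the Proxmox shell; the user and token names here are placeholders:

```bash
# Assumes a PVE-realm user "ansible@pve" already exists with the needed role.
# --privsep 0 makes the token inherit the user's permissions.
pveum user token add ansible@pve gh-actions --privsep 0
```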
There are some oddities to keep in mind with the proxmoxer-based Ansible modules:

- `api_user` is needed and used by the API client, formatted as `{{ user }}@domain`
- `api_token_id` is not the same as the output from the command; it's what you put into the "Token ID" field
- `{{ api_user }}!{{ api_token_id }}` should form the combined credential presented to the API, and must match the created token

If you attempt to use the output from the API creation screen as `api_user` or `api_token_id`, the API returns a `401 Invalid user` without much explanation of what the issue might be.
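To make that concrete, here's how the fields line up for a hypothetical token (all values invented):

```text
# Token created as user "ansible@pve" with Token ID "gh-actions":
api_user         = ansible@pve
api_token_id     = gh-actions
api_token_secret = aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
# Combined credential, as used in the Authorization header later on:
#   PVEAPIToken=ansible@pve!gh-actions=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
```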
Here's the pipeline. GitHub's primary job is to set up the Python/Ansible environment and translate the workflow inputs into something Ansible can properly digest.
I also added some `cat` steps - this lets us use the GitHub Actions log to record the requested parameters until the Netbox registration completes.
```yaml
---
name: "On-Demand: Build VM on Proxmox"

on:
  workflow_dispatch:
    inputs:
      machine_name:
        description: "Machine Name"
        required: true
        default: "examplename"
      machine_id:
        description: "VM ID (can't re-use)"
        required: true
      template:
        description: "VM Template Name"
        required: true
        type: choice
        options:
          - deb12.6-template
        default: "deb12.6-template"
      hardware_cpus:
        description: "VM vCPU Count"
        required: true
        default: "1"
      hardware_memory:
        description: "VM Memory Allocation (in MB)"
        required: true
        default: "512"

permissions:
  contents: read

jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Create Variable YAML File
        run: |
          cat <<EOF > roles/proxmox_kvm/parameters.yaml
          ---
          vm_data:
            name: "${{ github.event.inputs.machine_name }}"
            id: ${{ github.event.inputs.machine_id }}
            template: "${{ github.event.inputs.template }}"
            node: node
            hardware:
              cpus: ${{ github.event.inputs.hardware_cpus }}
              memory: ${{ github.event.inputs.hardware_memory }}
              storage: ssd-tier
              format: qcow2
          EOF
      - name: Build VM
        run: |
          cd roles/proxmox_kvm/
          cat parameters.yaml
          python3 -m venv .
          source bin/activate
          python3 -m pip install --upgrade pip
          python3 -m pip install -r requirements.txt
          python3 --version
          ansible --version

          export PAPIUSER="${{ secrets.PAPIUSER }}"
          export PAPI_TOKEN="${{ secrets.PAPI_TOKEN }}"
          export PAPI_SECRET="${{ secrets.PAPI_SECRET }}"
          export PHOSTNAME="${{ secrets.PHOSTNAME }}"
          export NETBOX_TOKEN="${{ secrets.NETBOX_TOKEN }}"
          export NETBOX_URL="${{ secrets.NETBOX_URL }}"
          export NETBOX_CLUSTER="${{ secrets.NETBOX_CLUSTER_PROX }}"
          ansible-playbook build_vm_prox.yml
```
In addition, a `requirements.txt` is needed for the workflow's venv setup step, and belongs in the role folder (`roles/proxmox_kvm`, as above):
```text
###### Requirements without Version Specifiers ######
pytz
netaddr
django
jinja2
requests
pynetbox

###### Requirements with Version Specifiers ######
ansible >= 8.4.0    # Mostly just don't use old Ansible (e.g. v2, v3)
proxmoxer >= 2.0.0
```
This Ansible playbook also integrates Netbox, as my vSphere workflow did, and uses a common schema to simplify code re-use. There are a few quirks with the Proxmox playbooks:
- There's no module to grab VM guest network information, but the API provides it, so I can get it with `uri`
- Proxmox has a nasty habit of breaking Ansible with JSON keys that include `-`. The best way to fix it is with a debug action: `{{ prox_network_result.json.data | replace('-','_') }}`
- Proxmox's VM copy needs a timeout configured, and it announces it's done before the VM is ready for actions. I added an `ansible.builtin.pause` step before starting the VM, and another after (to allow it to boot)
```yaml
---
- name: "Build VM on Proxmox"
  hosts: localhost
  gather_facts: true
  # Before executing ensure that the prerequisites are installed
  # `ansible-galaxy collection install netbox.netbox`
  # `python3 -m pip install aiohttp pynetbox`
  # We start with a pre-check playbook, if it fails, we don't want to
  # make changes
  any_errors_fatal: true
  vars_files:
    - "parameters.yaml"

  tasks:
    - name: "Debug"
      ansible.builtin.debug:
        msg: '{{ vm_data }}'
    - name: "Test connectivity and authentication"
      community.general.proxmox_node_info:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
      register: prox_node_result
    - name: "Display Node Data"
      ansible.builtin.debug:
        msg: '{{ prox_node_result }}'
    - name: "Build the VM"
      community.general.proxmox_kvm:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
        name: '{{ vm_data.name }}'
        node: '{{ vm_data.node }}'
        storage: '{{ vm_data.hardware.storage }}'
        newid: '{{ vm_data.id }}'
        clone: '{{ vm_data.template }}'
        format: '{{ vm_data.hardware.format }}'
        timeout: 500
        state: present
    - name: "Wait for the VM to fully register"
      ansible.builtin.pause:
        seconds: 15
    - name: "Start the VM"
      community.general.proxmox_kvm:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
        name: '{{ vm_data.name }}'
        state: started
    - name: "Wait for the VM to fully boot"
      ansible.builtin.pause:
        seconds: 45
    - name: "Get VM information"
      community.general.proxmox_vm_info:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
        vmid: '{{ vm_data.id }}'
      register: prox_vm_result
    - name: "Report the VM!"
      ansible.builtin.debug:
        var: prox_vm_result
    - name: "Fetch VM Networking information"
      ansible.builtin.uri:
        url: 'https://{{ lookup("env", "PHOSTNAME") }}:8006/api2/json/nodes/{{ vm_data.node }}/qemu/{{ vm_data.id }}/agent/network-get-interfaces'
        method: 'GET'
        headers:
          Content-Type: 'application/json'
          Authorization: 'PVEAPIToken={{ lookup("env", "PAPIUSER") }}!{{ lookup("env", "PAPI_TOKEN") }}={{ lookup("env", "PAPI_SECRET") }}'
        validate_certs: false
      register: prox_network_result
    - name: "Refactor Network Information"
      ansible.builtin.debug:
        msg: "{{ prox_network_result.json.data | replace('-','_') }}"
      register: prox_network_result_modified
    - name: "Register the VM in Netbox!"
      netbox.netbox.netbox_virtual_machine:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          cluster: '{{ lookup("env", "NETBOX_CLUSTER") }}'
          name: '{{ vm_data.name }}'
          description: 'Built by the GH Actions Pipeline!'
          local_context_data: '{{ prox_vm_result }}'
          memory: '{{ vm_data.hardware.memory }}'
          vcpus: '{{ vm_data.hardware.cpus }}'
    - name: "Configure VM Interface in Netbox!"
      netbox.netbox.netbox_vm_interface:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          name: '{{ vm_data.name }}_intf_{{ item.hardware_address | replace(":", "") | safe }}'
          virtual_machine: '{{ vm_data.name }}'
          vrf: 'Campus'
          mac_address: '{{ item.hardware_address }}'
      with_items: '{{ prox_network_result_modified.msg.result }}'
      when: item.hardware_address != '00:00:00:00:00:00'
    - name: "Reserve IP"
      netbox.netbox.netbox_ip_address:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          address: '{{ item.ip_addresses[0].ip_address }}/{{ item.ip_addresses[0].prefix }}'
          vrf: 'Campus'
          assigned_object:
            virtual_machine: '{{ vm_data.name }}'
        state: present
      with_items: '{{ prox_network_result_modified.msg.result }}'
      when: item.hardware_address != '00:00:00:00:00:00'
    - name: "Finalize the VM in Netbox!"
      netbox.netbox.netbox_virtual_machine:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          cluster: '{{ lookup("env", "NETBOX_CLUSTER") }}'
          tags:
            - 'lab_debian_machines'
            - 'lab_linux_machines'
            - 'lab_apt_updates'
          name: '{{ vm_data.name }}'
          primary_ip4:
            address: '{{ item.ip_addresses[0].ip_address }}/{{ item.ip_addresses[0].prefix }}'
            vrf: "Campus"
      with_items: '{{ prox_network_result_modified.msg.result }}'
      when: item.hardware_address != '00:00:00:00:00:00'
```
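As a usage note: since this is a plain workflow_dispatch pipeline, it can be kicked off from the gh CLI as well as the Actions UI. The input values below are just examples:

```bash
gh workflow run "On-Demand: Build VM on Proxmox" \
  -f machine_name=examplename \
  -f machine_id=401 \
  -f template=deb12.6-template \
  -f hardware_cpus=1 \
  -f hardware_memory=512
```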
Conclusion
Overall, the Proxmox API and playbooks are quite a bit simpler to use than the VMware ones. The proxmoxer-based modules are relatively feature-complete compared to vmware_rest, and where I did find gaps (examples not in this post), I could always fall back on Ansible's comprehensive Linux foundation to fill them. It's a refreshing change.