VM Deployment Pipelines with Proxmox
Decoupled approaches to deploying IaaS workloads are the way of the future. Here, we'll construct a VM deployment pipeline leveraging GitHub Actions and Ansible's community modules.
Proxmox Setup
- Not featured here: loading a VM ISO. The exact steps are particular to each Proxmox deployment, but it's necessary for the steps that follow.
Let's create a VM named `deb12.6-template`:
I set aside a separate VM ID range for templates so they sort together visually in the UI.
Note: As with vSphere, paravirtualized hardware is still the optimal choice - but in this case, VirtIO supplies the drivers.
Note: SSD Emulation and the QEMU Guest Agent are required for virtual disk reclamation with QEMU. This is particularly important in my lab.
In this installation, I'm using paravirtualized network adapters and have separated my management (`vmbr0`) and data plane (`vmbr1`) bridges.
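For reference, an equivalent hardware profile can be expressed from the Proxmox shell with `qm`. This is a sketch under assumed values - the VM ID, storage name, disk size, and ISO path are illustrative, not from my build:

```bash
# Hypothetical values: VM ID 9000 (from the template ID range),
# storage 'local-lvm', and a Debian 12.6 netinst ISO.
qm create 9000 --name deb12.6-template \
  --cores 1 --memory 2048 \
  --net0 virtio,bridge=vmbr0 --net1 virtio,bridge=vmbr1 \
  --scsihw virtio-scsi-single \
  --scsi0 local-lvm:32,ssd=1,discard=on \
  --agent enabled=1 \
  --cdrom local:iso/debian-12.6.0-amd64-netinst.iso
```

The `ssd=1,discard=on` pair and `--agent enabled=1` are what enable the disk reclamation mentioned above.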
Debian Linux Setup
I'll skip the Linux installer parts for brevity; Debian's installer is excellent and easy to use.
At a high level, we'll want to do some preparatory steps before declaring this a usable base image:
- Create users
  - Recommended approach: create a bootstrap user, then shred it
    - Leave the bootstrap user with an SSH key on the base image
    - After creation, build a takeover playbook that installs the latest and greatest username table, `sssd`, SSH keys, APM - anything with confidential cryptographic material that should not be left unencrypted on the hypervisor (see the sketch below)
    - This won't slow VM deployment by as much as you'd think
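Here's a minimal sketch of what that takeover playbook might look like. The `bootstrap` and `opsadmin` user names, the inventory group, and the key path are placeholders for illustration, not values from my actual playbook:

```yaml
---
# First play: connect as the baked-in bootstrap user and stand up the real one.
- name: "Install the permanent user"
  hosts: new_vms                  # placeholder inventory group
  remote_user: bootstrap
  become: true
  tasks:
    - name: "Create the permanent admin user"
      ansible.builtin.user:
        name: opsadmin            # placeholder name
        groups: sudo
        append: true
        shell: /bin/bash
    - name: "Install the permanent SSH key"
      ansible.posix.authorized_key:
        user: opsadmin
        key: "{{ lookup('file', 'files/opsadmin.pub') }}"   # placeholder path

# Second play: reconnect as the new user and shred bootstrap. A user can't
# cleanly delete itself while its own SSH session is active, hence two plays.
- name: "Shred the bootstrap user"
  hosts: new_vms
  remote_user: opsadmin
  become: true
  tasks:
    - name: "Remove bootstrap and its home directory (with the old key)"
      ansible.builtin.user:
        name: bootstrap
        state: absent
        remove: true
```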
- Install packages
  - This is just a list of some basics that I prefer to add to each machine. It's more network-centric; anything more comprehensive should be part of a build playbook specific to whatever's being deployed.
  - Note: This is an Ansible playbook, and therefore it needs Ansible to run (`apt install ansible`)
```yaml
---
- name: "Debian machine prep"
  hosts: localhost
  tasks:
    - name: "Install standard packages"
      ansible.builtin.apt:
        pkg:
          - 'curl'
          - 'dnsutils'
          - 'diffutils'
          - 'ethtool'
          - 'git'
          - 'mtr'
          - 'net-tools'
          - 'netcat-traditional'
          - 'python3-requests'
          - 'python3-jinja2'
          - 'tcpdump'
          - 'telnet'
          - 'traceroute'
          - 'qemu-guest-agent'
          - 'vim'
          - 'wget'
```
- Clean up the disk. This will make our base image more compact - each clone inherits any wasted space, so consider it a 10-20x savings in disk usage. I leave this as a file on the base image and name it `reset_vm.sh`:
```bash
#!/bin/bash

# Clean Apt
apt clean

# Cleaning logs.
if [ -f /var/log/audit/audit.log ]; then
  cat /dev/null > /var/log/audit/audit.log
fi
if [ -f /var/log/wtmp ]; then
  cat /dev/null > /var/log/wtmp
fi
if [ -f /var/log/lastlog ]; then
  cat /dev/null > /var/log/lastlog
fi

# Cleaning udev rules.
if [ -f /etc/udev/rules.d/70-persistent-net.rules ]; then
  rm /etc/udev/rules.d/70-persistent-net.rules
fi

# Cleaning the /tmp directories
rm -rf /tmp/*
rm -rf /var/tmp/*

# Cleaning the SSH host keys
rm -f /etc/ssh/ssh_host_*

# Cleaning the machine-id
truncate -s 0 /etc/machine-id
rm /var/lib/dbus/machine-id
ln -s /etc/machine-id /var/lib/dbus/machine-id

# Cleaning the shell history
unset HISTFILE
history -cw
echo > ~/.bash_history
rm -fr /root/.bash_history

# Truncating hostname, hosts, resolv.conf and setting hostname to localhost
truncate -s 0 /etc/{hostname,hosts,resolv.conf}
hostnamectl set-hostname localhost

# Clean cloud-init - deprecated because cloud-init isn't currently used
# cloud-init clean -s -l

# Force a filesystem sync
sync
```
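One addition worth considering (my suggestion, not part of the script above): since the disk was attached with SSD Emulation and discard enabled, a trim pass at the end of the reset hands the freed blocks back to thin-provisioned storage:

```bash
# Return the space freed above to the backing storage; this only works if
# the virtual disk exposes discard (the SSD Emulation note earlier).
fstrim -av
```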
Shut down the virtual machine. I prefer to start it back up and shut it down again from the hypervisor to ensure that `qemu-guest-agent` is working properly.
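From the Proxmox shell, that verification pass might look like the following sketch, assuming the template got VM ID 9000 as in the earlier example:

```bash
qm start 9000           # hypothetical template VM ID
qm agent 9000 ping      # only succeeds if qemu-guest-agent is running in the guest
qm shutdown 9000        # clean shutdown via ACPI or the guest agent
qm template 9000        # optional: mark the VM as a clone-only template
```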
Deployment Pipeline
First, we will want to create an API token under "Datacenter -> Permissions -> API Tokens":
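If you prefer the CLI over the UI, the same token can be created with `pveum` - a sketch assuming a dedicated automation user; the `ansible@pve` user name, `github` token ID, and role are illustrative:

```bash
# Hypothetical user/token names - adjust to taste.
pveum user add ansible@pve --comment "Pipeline automation user"
pveum user token add ansible@pve github --privsep 0
# With privilege separation off, the token inherits the user's permissions.
pveum acl modify / --users ansible@pve --roles PVEVMAdmin
```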
There are some oddities to keep in mind with the `proxmoxer`-based Ansible modules:

- `api_user` is needed and used by the API client, formatted as `{{ user }}@domain`
- `api_token_id` is not the same as the output from the command; it's what you put into the "Token ID" field.
- `{{ api_user }}!{{ api_token_id }}` should form the combined credential presented to the API, and match the created token.

If you attempt to use the output from the API creation screen as `api_user` or `api_token_id`, it'll return a `401 Invalid user` without much explanation as to what might be the issue.
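To sanity-check the credential outside of Ansible, the combined form can be tested directly against the API - a sketch with placeholder values for the user, token ID, secret, and hostname:

```bash
# Header format: PVEAPIToken=<user>@<realm>!<tokenid>=<secret>
# -k because the lab uses a self-signed certificate (validate_certs: false below).
curl -k \
  -H 'Authorization: PVEAPIToken=ansible@pve!github=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' \
  https://proxmox.example.com:8006/api2/json/version
```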
Here's the pipeline. GitHub's primary job is to set up the Python/Ansible environment and translate the workflow inputs into something that Ansible can properly digest. I also added some `cat` steps - these let us use the GitHub Actions log to record the intent of a run until Netbox registration completes.
```yaml
---
name: "On-Demand: Build VM on Proxmox"

on:
  workflow_dispatch:
    inputs:
      machine_name:
        description: "Machine Name"
        required: true
        default: "examplename"
      machine_id:
        description: "VM ID (can't re-use)"
        required: true
      template:
        description: "VM Template Name"
        required: true
        type: choice
        options:
          - deb12.6-template
        default: "deb12.6-template"
      hardware_cpus:
        description: "VM vCPU Count"
        required: true
        default: "1"
      hardware_memory:
        description: "VM Memory Allocation (in MB)"
        required: true
        default: "512"

permissions:
  contents: read

jobs:
  build:
    runs-on: self-hosted
    steps:
      - uses: actions/checkout@v4
      - name: Create Variable YAML File
        run: |
          cat <<EOF > roles/proxmox_kvm/parameters.yaml
          ---
          vm_data:
            name: "${{ github.event.inputs.machine_name }}"
            id: ${{ github.event.inputs.machine_id }}
            template: "${{ github.event.inputs.template }}"
            node: node
            hardware:
              cpus: ${{ github.event.inputs.hardware_cpus }}
              memory: ${{ github.event.inputs.hardware_memory }}
              storage: ssd-tier
              format: qcow2
          EOF
      - name: Build VM
        run: |
          cd roles/proxmox_kvm/
          cat parameters.yaml
          python3 -m venv .
          source bin/activate
          python3 -m pip install --upgrade pip
          python3 -m pip install -r requirements.txt
          python3 --version
          ansible --version

          export PAPIUSER="${{ secrets.PAPIUSER }}"
          export PAPI_TOKEN="${{ secrets.PAPI_TOKEN }}"
          export PAPI_SECRET="${{ secrets.PAPI_SECRET }}"
          export PHOSTNAME="${{ secrets.PHOSTNAME }}"
          export NETBOX_TOKEN="${{ secrets.NETBOX_TOKEN }}"
          export NETBOX_URL="${{ secrets.NETBOX_URL }}"
          export NETBOX_CLUSTER="${{ secrets.NETBOX_CLUSTER_PROX }}"
          ansible-playbook build_vm_prox.yml
```
In addition, a `requirements.txt` is needed for the workflow's `venv` setup step, and belongs in the role folder (`roles/proxmox_kvm` as above):
```text
###### Requirements without Version Specifiers ######
pytz
netaddr
django
jinja2
requests
pynetbox

###### Requirements with Version Specifiers ######
ansible >= 8.4.0  # Mostly just don't use old Ansible (e.g. v2, v3)
proxmoxer >= 2.0.0
```
This Ansible playbook also integrates Netbox, as my vSphere workflow did, and uses a common schema to simplify code re-use. There are a few quirks with the Proxmox playbooks:

- There's no module to grab VM guest network information, but the API provides it, so I can fetch it with `ansible.builtin.uri`
- Proxmox has a nasty habit of breaking Ansible with JSON keys that include `-`. The best way to fix it is with a debug action: `{{ prox_network_result.json.data | replace('-','_') }}` (illustrated after this list)
- Proxmox's VM clone needs a timeout configured, and announces it's done before the VM is ready for actions. I added an `ansible.builtin.pause` step before starting the VM, and another after (to allow it to boot)
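Why those `-` keys hurt: Jinja2 parses a dash in dotted attribute access as subtraction, so the underscore rewrite is what makes the later loops possible. A small illustration (`hardware-address` is a real key in the guest agent's response):

```yaml
- name: "Fails: Jinja2 reads 'hardware-address' as (item.hardware - address)"
  ansible.builtin.debug:
    msg: "{{ item.hardware-address }}"   # undefined-variable error, not a key lookup

- name: "Works after the replace('-','_') pass"
  ansible.builtin.debug:
    msg: "{{ item.hardware_address }}"
```

Bracket access (`item['hardware-address']`) would also work, but the rewrite keeps the rest of the templating uniform. The full playbook: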
```yaml
---
- name: "Build VM on Proxmox"
  hosts: localhost
  gather_facts: true
  # Before executing ensure that the prerequisites are installed
  # `ansible-galaxy collection install netbox.netbox`
  # `python3 -m pip install aiohttp pynetbox`
  # We start with a pre-check playbook, if it fails, we don't want to
  # make changes
  any_errors_fatal: true
  vars_files:
    - "parameters.yaml"

  tasks:
    - name: "Debug"
      ansible.builtin.debug:
        msg: '{{ vm_data }}'
    - name: "Test connectivity and authentication"
      community.general.proxmox_node_info:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
      register: prox_node_result
    - name: "Display Node Data"
      ansible.builtin.debug:
        msg: '{{ prox_node_result }}'
    - name: "Build the VM"
      community.general.proxmox_kvm:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
        name: '{{ vm_data.name }}'
        node: '{{ vm_data.node }}'
        storage: '{{ vm_data.hardware.storage }}'
        newid: '{{ vm_data.id }}'
        clone: '{{ vm_data.template }}'
        format: '{{ vm_data.hardware.format }}'
        timeout: 500
        state: present
    - name: "Wait for the VM to fully register"
      ansible.builtin.pause:
        seconds: 15
    - name: "Start the VM"
      community.general.proxmox_kvm:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
        name: '{{ vm_data.name }}'
        state: started
    - name: "Wait for the VM to fully boot"
      ansible.builtin.pause:
        seconds: 45
    - name: "Get VM information"
      community.general.proxmox_vm_info:
        api_host: '{{ lookup("env", "PHOSTNAME") }}'
        api_user: '{{ lookup("env", "PAPIUSER") }}'
        api_token_id: '{{ lookup("env", "PAPI_TOKEN") }}'
        api_token_secret: '{{ lookup("env", "PAPI_SECRET") }}'
        vmid: '{{ vm_data.id }}'
      register: prox_vm_result
    - name: "Report the VM!"
      ansible.builtin.debug:
        var: prox_vm_result
    - name: "Fetch VM Networking information"
      ansible.builtin.uri:
        url: 'https://{{ lookup("env", "PHOSTNAME") }}:8006/api2/json/nodes/{{ vm_data.node }}/qemu/{{ vm_data.id }}/agent/network-get-interfaces'
        method: 'GET'
        headers:
          Content-Type: 'application/json'
          Authorization: 'PVEAPIToken={{ lookup("env", "PAPIUSER") }}!{{ lookup("env", "PAPI_TOKEN") }}={{ lookup("env", "PAPI_SECRET") }}'
        validate_certs: false
      register: prox_network_result
    - name: "Refactor Network Information"
      ansible.builtin.debug:
        msg: "{{ prox_network_result.json.data | replace('-','_') }}"
      register: prox_network_result_modified
    - name: "Register the VM in Netbox!"
      netbox.netbox.netbox_virtual_machine:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          cluster: '{{ lookup("env", "NETBOX_CLUSTER") }}'
          name: '{{ vm_data.name }}'
          description: 'Built by the GH Actions Pipeline!'
          local_context_data: '{{ prox_vm_result }}'
          memory: '{{ vm_data.hardware.memory }}'
          vcpus: '{{ vm_data.hardware.cpus }}'
    - name: "Configure VM Interface in Netbox!"
      netbox.netbox.netbox_vm_interface:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          name: '{{ vm_data.name }}_intf_{{ item.hardware_address | replace(":", "") | safe }}'
          virtual_machine: '{{ vm_data.name }}'
          vrf: 'Campus'
          mac_address: '{{ item.hardware_address }}'
      with_items: '{{ prox_network_result_modified.msg.result }}'
      when: item.hardware_address != '00:00:00:00:00:00'
    - name: "Reserve IP"
      netbox.netbox.netbox_ip_address:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          address: '{{ item.ip_addresses[0].ip_address }}/{{ item.ip_addresses[0].prefix }}'
          vrf: 'Campus'
          assigned_object:
            virtual_machine: '{{ vm_data.name }}'
        state: present
      with_items: '{{ prox_network_result_modified.msg.result }}'
      when: item.hardware_address != '00:00:00:00:00:00'
    - name: "Finalize the VM in Netbox!"
      netbox.netbox.netbox_virtual_machine:
        netbox_token: '{{ lookup("env", "NETBOX_TOKEN") }}'
        netbox_url: '{{ lookup("env", "NETBOX_URL") }}'
        validate_certs: false
        data:
          cluster: '{{ lookup("env", "NETBOX_CLUSTER") }}'
          tags:
            - 'lab_debian_machines'
            - 'lab_linux_machines'
            - 'lab_apt_updates'
          name: '{{ vm_data.name }}'
          primary_ip4:
            address: '{{ item.ip_addresses[0].ip_address }}/{{ item.ip_addresses[0].prefix }}'
            vrf: "Campus"
      with_items: '{{ prox_network_result_modified.msg.result }}'
      when: item.hardware_address != '00:00:00:00:00:00'
```
Conclusion
Overall, the Proxmox API and playbooks are quite a bit simpler to use than the VMware ones. The `proxmoxer`-based modules are relatively feature-complete compared to `vmware_rest`, and where I did find gaps (examples not in this post), I could always fall back on Ansible's comprehensive Linux foundation to fill them. It's a refreshing change.