NSX Advanced Load Balancer - NSX-T Service Engine Creation Failures: CC_SE_CREATION_FAILURE and Transport Node Not Found to create service engine

TL;DR

If you see either of these errors, run grep 'ERROR' /opt/avi/log/cc_agent_go_{{ cloud }} to look for the potential cause. In my case, the / character in a network name was not correctly processed by Avi's Golang client (facing vCenter).

The Problem

When trying to configure NSX ALB + NSX-T in my home lab, I was presented with nothing but the following error:

CC_SE_CREATION_FAILURE

The Process

Avi Vantage appears to be treating this as a retriable error, attempting to deploy a service engine five times, which can be re-executed with a controller restart:

Avi Controller Logs

Oddly enough, vCenter doesn't report any OVA deploy attempts. The next thing to check here would be the vSphere content library:

vSphere Content Library

So far, so good. vCenter knows where to deploy the image from.

Now here's a problem: Avi doesn't provide any documentation on how to troubleshoot this yet, so I did a bit of digging and found that you can bump yourself to root with:

sudo su

Useful note: Avi Vantage is running bullseye/sid with only 821 packages listed under dpkg -l | wc -l. They did a pretty good job with pre-release cleanup, but there are still a few oddball packages in there. I'd give it a 9/10; I'd like to see X11 not installed, but I'm pleased to see only Python 3!

Avi's logs are located in:

/var/lib/avi/log
/opt/avi/log
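With two log directories and many files in each, it helps to narrow down which files mention the error at all. Here is a minimal Python sketch, assuming it is run as root on the controller. The function name files_mentioning is my own and is not part of any Avi tooling.

```python
import os

# The two log locations listed above.
LOG_DIRS = ["/var/lib/avi/log", "/opt/avi/log"]


def files_mentioning(needle, log_dirs=LOG_DIRS):
    """Return paths of files under log_dirs that contain `needle`."""
    hits = []
    for log_dir in log_dirs:
        # os.walk silently yields nothing for a directory that does not exist.
        for root, _dirs, files in os.walk(log_dir):
            for name in sorted(files):
                path = os.path.join(root, name)
                try:
                    # errors="replace" keeps binary or rotated files from raising.
                    with open(path, errors="replace") as handle:
                        if any(needle in line for line in handle):
                            hits.append(path)
                except OSError:
                    # Unreadable file: skip it rather than abort the search.
                    continue
    return hits
```

Calling files_mentioning("CC_SE_CREATION_FAILURE") returns the list of files worth opening first.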

Here's what I found in alert_notifications_debug.log:

summary: "Syslog for System Events occured"
event_pages: "EVENT_PAGE_VS"
event_pages: "EVENT_PAGE_CNTLR"
event_pages: "EVENT_PAGE_ALL"
obj_name: "avi_-Avi-se-rctbp"
tenant_uuid: "admin"
related uuids ['avi_-Avi-se-rctbp']
[2021-04-09 20:06:30,923] INFO [alert_engine.processAlertInstance:225] [uuid: ""
alert_config_uuid: "alertconfig-938cf267-e20d-4d8e-a50a-21f0f5a5b633"
timestamp: 1617998694.0
obj_uuid: "avi_-Avi-se-rctbp"
threshold: 0
events {
  report_timestamp: 1617998694
  obj_type: SEVM
  event_id: CC_SE_CREATION_FAILURE
  module: CLOUD_CONNECTOR
  internal: EVENT_EXTERNAL
  context: EVENT_CONTEXT_SYSTEM
  obj_uuid: "avi_-Avi-se-rctbp"
  obj_name: "avi_-Avi-se-rctbp"
  event_details {
    cc_se_vm_details {
      cc_id: "cloud-022c7b90-f987-4b15-91bb-1f1405715580"
      se_vm_uuid: "avi_-Avi-se-rctbp"
      error_string: "Transport node not found to create serviceengine avi_-Avi-se-rctbp"
    }
  }
  event_description: "Service Engine creation failure"
  event_pages: "EVENT_PAGE_VS"
  event_pages: "EVENT_PAGE_CNTLR"
  event_pages: "EVENT_PAGE_ALL"
  tenant_name: ""
  tenant: "admin"
}
reason: "threshold_exceeded"
state: ALERT_STATE_ON
related_uuids: "avi_-Avi-se-rctbp"
level: ALERT_LOW
name: "Syslog-System-Events-avi_-Avi-se-rctbp-1617998694.0-1617998694-45824571"
summary: "Syslog for System Events occured"
event_pages: "EVENT_PAGE_VS"
event_pages: "EVENT_PAGE_CNTLR"
event_pages: "EVENT_PAGE_ALL"
obj_name: "avi_-Avi-se-rctbp"
tenant_uuid: "admin"

From the looks of things, Avi talks to NSX-T before vCenter to determine appropriate placement, which makes sense.
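Most of that alert dump is noise; the two fields that matter are event_id and error_string. A rough Python sketch for pulling just those pairs out of the text follows. It assumes the field layout is exactly as shown in the excerpt above, and extract_failures is a name I made up for illustration.

```python
import re

# Patterns for the two fields of interest, based only on the excerpt above.
EVENT_ID_RE = re.compile(r'\bevent_id:\s*(\S+)')
ERROR_STRING_RE = re.compile(r'\berror_string:\s*"(.*)"')


def extract_failures(lines):
    """Return (event_id, error_string) pairs in the order they appear."""
    failures = []
    current_event = None
    for line in lines:
        match = EVENT_ID_RE.search(line)
        if match:
            # Remember the most recent event_id; error_string follows it.
            current_event = match.group(1)
            continue
        match = ERROR_STRING_RE.search(line)
        if match:
            failures.append((current_event, match.group(1)))
    return failures


sample = [
    "  event_id: CC_SE_CREATION_FAILURE",
    "  module: CLOUD_CONNECTOR",
    '      error_string: "Transport node not found to create serviceengine avi_-Avi-se-rctbp"',
]
for event_id, error in extract_failures(sample):
    print(f"{event_id}: {error}")
```

Feeding it the lines of alert_notifications_debug.log instead of the sample reduces each alert to one readable line.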

Update and Root Cause

With the Avi 20.1.6 release, VMware has made a lot of improvements to logging! We're now seeing this error in the GUI (ensure that "Internal Events" is checked):

Avi Events

Avi Event

Let's take a look at the new logging. Avi's controller system leverages a series of Go modules called "cloud connectors", each dedicated to a specific interface. Each one has its own log file in /opt/avi/log/cc_

2021-07-04T20:20:42.801Z        ERROR   vcenterlib/vcenter_utils.go:606 [10.66.0.202][avi-mgt-vni-10.7.80.0/24] object references is empty
2021-07-04T20:20:42.819Z        ERROR   vcenterlib/vcenter_utils.go:578 [10.66.0.202][avi-mgt-vni-10.7.80.0/24] object references is empty
2021-07-04T20:20:42.822Z        ERROR   vcenterlib/vcenter_se_lifecycle.go:432  [10.66.0.202][QH] [10.66.0.202] Network 'avi-mgt-vni-10.7.80.0/24' matching not found in Vcenter
2021-07-04T20:20:42.822Z        ERROR   vcenterlib/vcenter_se_lifecycle.go:891  [10.66.0.202] [10.66.0.202] Network 'avi-mgt-vni-10.7.80.0/24' matching not found in Vcenter
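If the connector log is long, the network names it complains about can be pulled out mechanically. This Python sketch assumes every line follows the "timestamp, level, source location, message" layout of the four lines above. I have not verified that layout against other Avi versions, and missing_networks is a hypothetical helper name.

```python
import re

# Message text copied from the ERROR lines above.
NETWORK_NOT_FOUND_RE = re.compile(r"Network '([^']+)' matching not found in Vcenter")


def missing_networks(lines):
    """Return the distinct network names vCenter lookups failed on, in first-seen order."""
    names = []
    for line in lines:
        # Split into timestamp, level, and the rest of the line.
        fields = line.split(None, 2)
        if len(fields) < 3 or fields[1] != "ERROR":
            continue
        match = NETWORK_NOT_FOUND_RE.search(fields[2])
        if match and match.group(1) not in names:
            names.append(match.group(1))
    return names
```

Run against the four lines above, this reports a single name, avi-mgt-vni-10.7.80.0/24, which points straight at the segment to inspect.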

Now, this vn-segment does exist in vCenter, so I fell back on the "unescaped shell character" instinct from years of Linux/Unix administration and renamed it to avi-mgt-vni-10.7.80.0_24.
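To check a list of segment names for the same trap before deploying, something like the following Python sketch would do. It only flags the / character, because that is the only one I have seen cause trouble; whether other characters are affected is untested. Both function names are hypothetical.

```python
def suggest_safe_name(name):
    """Replace '/' with '_', mirroring the manual rename described above."""
    return name.replace("/", "_")


def flag_risky_names(names):
    """Map each name containing '/' to a suggested replacement."""
    return {name: suggest_safe_name(name) for name in names if "/" in name}


segments = ["avi-mgt-vni-10.7.80.0/24", "vm-network"]
for old, new in flag_risky_names(segments).items():
    print(f"rename {old} -> {new}")
```

The rename itself still has to be done by hand on the segment; this only tells you which names to look at.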

Since we don't get a Redeploy (please VMware!) button, I restarted the controller and all SE deployments succeeded after that.