Friday, March 15, 2019

Forward Error Correction, a story about Generation 14 PowerEdge and 25 Gigabit connectivity

25 Gigabit implementations

First of all - anyone who thinks Layer 1 is simple is wrong.

That being said, 25G/50G/100G/QSFP28 services differ from 10G in more ways than simply being 2.5x faster. Under 802.3by (or 25 Gig, to those who use it):

  • Full-Duplex is mandatory.
  • Energy Efficient Operation
  • Stackable lanes, each supporting speeds of up to 28 Gbit/s
  • For those who love it, Twinax maxes out at 5 meters for now
  • Currently, cost for 25G/100G silicon appears to be less than for 10G/100G at the switch level.

What has NOT changed

  • BER requirements are still 10^-12 (no more than one errored bit per 10^12 bits)
  • All existing 802.1* protocols remain supported
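To put the BER figure in perspective, here is a quick back-of-the-envelope sketch. It assumes a bit error ratio of 10^-12 and a link that is fully loaded at its nominal rate; the function name is mine, purely for illustration.

```python
def seconds_between_bit_errors(line_rate_bps: float, ber: float = 1e-12) -> float:
    """Average seconds between errored bits on a fully loaded link."""
    errors_per_second = line_rate_bps * ber
    return 1.0 / errors_per_second


# Same error ratio, different line rates (nominal rates, an assumption).
for label, rate in (("10G", 10e9), ("25G", 25e9), ("100G", 100e9)):
    gap = seconds_between_bit_errors(rate)
    print(f"{label}: one errored bit roughly every {gap:.0f} s")
```

The error ratio is unchanged, but the faster the link, the more often an errored bit shows up in wall-clock time.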
That being said, I've been working to bring 25G to the server for quite some time now, and we waited with bated breath as the new servers (sporting Broadcom 2x25G NICs) booted up...

and proceeded not to establish any link-level connectivity.

Well, we rounded up the usual suspects, trying to statically set speed and duplex instead of negotiating them (which is probably a platform oddity), to no avail.

As it turns out - Forward Error Correction is the culprit. Upon reviewing the IEEE's docs on 802.3by, we found this gem, indicating the difficulties with negotiating different FEC modes:

Clause 73 outlines a set of five bits for FEC auto-negotiation, which let both ends of a link signal what they support and agree on which mode to use. Keep in mind that any active connection (all optics, twinax over 5 meters) will require some form of FEC to detect and correct the bit errors that will probably occur on the link:
F0: 10G FEC Offered
F1: 10G FEC Requested
F2: 25G RS-FEC Offered (ideal)
F3: 25G RS-FEC Requested
F4: 25G Base-R (Fire Code) FEC Requested
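
As a rough illustration of how a five-bit field like this can be read, here is a minimal sketch that maps each bit to the name given in the list above. It assumes F0 is the least significant bit of a standalone 5-bit value; the real position of these bits inside the auto-negotiation pages is not modeled here, and the class and function names are my own.

```python
from enum import IntFlag


class FecBits(IntFlag):
    """FEC bits F0..F4 as listed above (bit order is an assumption)."""
    F0_10G_FEC_OFFERED = 1 << 0
    F1_10G_FEC_REQUESTED = 1 << 1
    F2_25G_RS_FEC_OFFERED = 1 << 2
    F3_25G_RS_FEC_REQUESTED = 1 << 3
    F4_25G_BASE_R_FEC_REQUESTED = 1 << 4


def decode_fec_bits(value: int) -> list:
    """Return the names of every FEC bit set in a 5-bit value."""
    return [flag.name for flag in FecBits if value & flag.value]


# A port that sets F2 and F3: it offers and requests 25G RS-FEC.
print(decode_fec_bits(0b01100))
```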

This is important for preventing downstream failures now that we're transmitting data at considerably higher speeds. However, since 802.3by was released as recently as 2016 (and RS-FEC came out in 2017), support for the various modes can be a bit lopsided. Here's the order of preference with a reliability bias - invert the list if latency is the primary goal / you use really good cables:
  1. RS-FEC
  2. FC-FEC
  3. FEC Disabled
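
The preference order above can be sketched as a small helper that picks a mode both ends support. This is not the Clause 73 resolution logic, just an illustration of the list; the mode labels and the function name are hypothetical, and whatever it returns still has to be configured by hand on the server and the switch.

```python
# Most reliable first, matching the list above.
FEC_BY_RELIABILITY = ("rs-fec", "fc-fec", "disabled")


def pick_fec(server_modes, switch_modes, prefer_latency=False):
    """Return the most preferred FEC mode supported by both ends, or None."""
    if prefer_latency:
        order = tuple(reversed(FEC_BY_RELIABILITY))
    else:
        order = FEC_BY_RELIABILITY
    for mode in order:
        if mode in server_modes and mode in switch_modes:
            return mode
    return None


server = {"rs-fec", "fc-fec", "disabled"}
# Hypothetical single-lane port on a switch without RS-FEC support.
switch = {"fc-fec", "disabled"}
print(pick_fec(server, switch))                       # reliability bias
print(pick_fec(server, switch, prefer_latency=True))  # latency bias
```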
Currently, Generation 14 Dell PowerEdge appears to support all modes, but defaults to "disabled" and completely fails to auto-negotiate. No matter what, when using the onboard Broadcom NICs, you will need to explicitly select an option here, and then apply the same option to your switch.

In addition, early-generation 802.3by switches like Cisco's Nexus EX will not support RS-FEC in single-lane modes, but will support it on multi-lane transceivers.

This can also be resolved by buying newer generation switches (FX+), but all generations appear to auto-negotiate with no issues within the switch-to-switch realm.

What is FEC?

Well, the Wikipedia article on forward error correction is a pretty good start, but it is awfully vague. Long story short, you have the option of adding about 80-250 nanoseconds of latency to essentially run a "what-if" analysis on a link's apparent reliability. This is great, especially with twinax, where bit errors are a bit more common than with fiber optics. FEC can also provide feedback on bit errors between destinations, allowing it to "train" or "self-heal" links - allowing for much higher link reliability.
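
To make the "forward" part concrete, here is a toy sketch using a Hamming(7,4) code. This is NOT RS-FEC or Fire Code, which are far more capable; it is simply the smallest code I know that shows the idea: the sender adds parity bits, and the receiver repairs a single flipped bit without asking for a retransmit. All names are my own.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit; return (data bits, error position)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if position:
        c[position - 1] ^= 1  # flip the errored bit back
    return [c[2], c[4], c[5], c[6]], position


sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1  # simulate one bit error on the wire (position 5)
print(hamming74_decode(received))
```

The extra parity bits and the decode step are where the added latency comes from: the receiver has to collect a whole block before it can check and repair it.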

What this means to me

In this case, the following design considerations apply:
  • If it's important, use multi-lane slots for it: 
    • If you're egressing a fabric, you should use QSFP28 transceivers if cost allows. This will provide RS-FEC where it counts
    • If you have spine switches, use QSFP28 transceivers.
  • If you're buying now, read the product sheets for both your servers and your switches to ensure that RS-FEC is supported, and use optical cabling.

Sunday, March 3, 2019

Gotchas with NSX-T 2.4

NSX-T 2.4 is a major software upgrade with a multitude of new features.

The documentation for this release is not very mature, so I've compiled some gotchas I found while installing NSX-T 2.4 below:

  • When you configure host transport nodes (such as ESXi), ensure that all the transport zones you need are provisioned on the host! The node summary should show a minimum of two transport zones, one for underlay and one for overlay.
  • Don't forget your uplink VLAN! The uplink VLAN must be configured under "Advanced Networking & Security" -> Networking -> Switching, and should participate in your underlay transport zone.
  • NSX Controllers are no more. This functionality is merged into the NSX Manager - which now has clustering support. As a result, you'll need to configure a VIP for the manager cluster as well.
  • NSX Managers need more RAM. VMware recommends 24 GB of memory per manager if you have fewer than 64 hosts - I'm running stable with 16 GB and 1 host, but consider 24 GB the minimum.

Running a serial console server over ESXi

Since I'm building a hybrid systems/networking lab, one of the key features I'll need is a serial console server to administer the lab switches. There are a few options here:

  • Find an old Cisco Router and some async octal cables (Rare, takes up rack space)
  • Purchase a serial console server like MRV, Perle, Internetwatchdogs, etc ($$$)
  • Build an RPi as the console server (current solution, consumes 1 outlet)
  • Build a VM, and connect the USB-to-Serial Adapter
The last one is interesting, here's why. I have an ansible server that I intend to use for most patching/administration tasks, and to trial out certain aspects of network automation, and ansible lists a very interesting feature, proxies:
I could plausibly list the ansible VM's loopback address as a proxy, allowing me to use it to automate early-stage network provisioning without network connectivity. I know it's a petty thing to want to automate, but that particular aspect of network device provisioning is pretty tedious. You have to:
  1. Upgrade to your baselined code revision
  2. Configure basic networking
  3. Download baseline config, and then customize it
  4. Restart to new config
#1 is a pretty slow task, and I'd like to automate it - it'd be great to let ansible babysit switches while they provision instead of having to stand over them the entire time. These are pretty simple tasks for most route-switch platforms - typically only requiring a binary copy and a reboot or two.

Anyhow, let's get down to configuring the basics. I'm performing this from the vCenter 6.7 GUI, so YMMV on user interfaces. All you have to do is plug in your USB-to-Serial adapter, and then add it to the VM as a "Host USB Device." I'd recommend FTDI-type adapters, as they don't typically require any driver install to work on either ESXi or Linux.

Now, let's see if they show up:

ansible:~ # ls /dev/ttyU*
/dev/ttyUSB0  /dev/ttyUSB1  /dev/ttyUSB2  /dev/ttyUSB3
We're all set! I typically use screen as a direct console emulator, but they all more or less do the same thing. At this point we're really just trying to test the console ports to see if they work:

ansible:~ # screen /dev/ttyUSB0

         --- System Configuration Dialog ---

Would you like to enter the initial configuration dialog? [yes/no]:

ansible:~ # screen /dev/ttyUSB1

User Access Verification


ansible:~ # screen /dev/ttyUSB2

Would you like to terminate autoinstall? [yes]: yes

ansible:~ # screen /dev/ttyUSB3


ansible:~ # killall screen
Looks like we're fully functional on all serial ports - I have 3 unprovisioned WS-C3560-24-TS-E for future lab use. The last command was to ensure that the proxy software wouldn't have to compete with screen for ownership of a serial device.

We'll be installing ser2net next - it only supports telnet, but you can tunnel it over SSH in a prod environment. Honestly, if you want this in your work environment it'd be much better to use a dedicated console server - a 48-port unit will cost you less than a Dell R430, and can connect to phone lines. They're worth it.

ansible:~ # zypper in ser2net
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following NEW package is going to be installed:
  ser2net

1 new package to install.
Overall download size: 92.3 KiB. Already cached: 0 B. After the operation, additional 200.1 KiB will be used.
Continue? [y/n/...? shows all options] (y): y
Retrieving package ser2net-3.5-2.2.x86_64                                          (1/1),  92.3 KiB (200.1 KiB unpacked)
Retrieving: ser2net-3.5-2.2.x86_64.rpm ...........................................................................[done]
Checking for file conflicts: ----------------------------------------------------------------------------------------[done]
(1/1) Installing: ser2net-3.5-2.2.x86_64 ----------------------------------------------------------------------------[done]
Then we create a config file:

# <TCP port>:<state>:<timeout>:<device>:<options>

BANNER:banner1:TCP port \p device \d\r\n
BANNER:banner2:TCP port \p device \d\r\n
BANNER:banner3:TCP port \p device \d  serial parms \s\r\n
OPENSTR:open1:Open str\r\n
# Default value settings.  The given values are the defaults.  For non
# boolean values the possible values are given above.
#** serial device and SOL **
# speed: standard speeds shown above
# databits: 5,6,7,8
#** serial device only **
# stopbits: 1,2
# parity: none, even, odd
And we're set! Systemd will automatically start ser2net with the VM.
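
For reference, a port definition in ser2net 3.x follows the pattern TCP port:state:timeout:device:options. Below is a small sketch that generates one telnet listener line per adapter. The TCP ports (2001 and up), the 9600 8N1 serial settings, and the helper name are my assumptions for illustration, so match them to your switches' console settings; banner1 refers to the banner defined in the config above.

```python
def ser2net_port_lines(devices, base_port=2001, speed=9600):
    """Build one ser2net 3.x config line per serial device.

    Ports, speed and framing are assumptions; timeout 0 disables
    the idle timeout.
    """
    lines = []
    for offset, device in enumerate(devices):
        port = base_port + offset
        options = f"{speed} 8DATABITS NONE 1STOPBIT banner1"
        lines.append(f"{port}:telnet:0:{device}:{options}")
    return lines


adapters = [f"/dev/ttyUSB{n}" for n in range(4)]
for line in ser2net_port_lines(adapters):
    print(line)
```

With lines like these in the config, each switch console would be reachable by telnetting to the VM on its own TCP port.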

Minemeld installation, continued

I cheated/pivoted a little bit - decided to simulate a bit more closely what I'd be using at work. I bootstrapped a CentOS VM and followed the instructions in:

$ sudo yum install -y wget git gcc python-devel libffi-devel openssl-devel zlib-devel sqlite-devel bzip2-devel
$ wget
$ sudo -H python
$ sudo -H pip install ansible
$ git clone
$ cd minemeld-ansible
$ ansible-playbook -K -i, local.yml
$ usermod -a -G minemeld  # add your user to minemeld group, useful for development

Everything worked fine - I had to retry the playbook once to get it to run, but the install playbook even enabled/started the requisite services. I'd highly recommend this approach over the OVA - it took me ~ 30 minutes in total to get Minemeld up and running in my lab, including the CentOS ISO download.

Anyone else who is doing this may find it useful to know that the usermod above doesn't grant you login access to minemeld - it has its own credential set. Default credentials are admin|minemeld.

My next objective will be to integrate with my lab firewall using EDLs. Here's a preview of it running without any custom miners - eventually I'd like to mine NSX-T's manager to share object groups between systems.

VyOS and other Linux builds unable to use `vmxnet3` or "VMware Paravirtual SCSI" adapter on vSphere

Have you seen this selector when building machines on vSphere? This causes some fairly common issues in NOS VMs, as most don't really kn...