A VxRail customer of mine recently asked whether they could deploy NSX-T on an existing brownfield cluster. The system is in production and running workloads. There was no Day 0 requirement for NSX-T at the time of deployment, but requirements have changed in the interim. My stance would normally be to recommend the fully engineered VCF on VxRail path, which automates the NSX-T deployment. Clearly this is a heavier lift and may require a full logical rebuild of the environment to stand up the Management domain and so on. This particular customer wasn't quite ready for the VCF path just yet. So what are the options?
Bottom line: whilst it requires careful planning, deploying NSX-T in a VxRail brownfield environment is not as onerous a task as one would think. There is no requirement to redeploy the VxRail cluster. NSX-T will happily consume the pre-existing VDS that was stood up by VxRail Manager. The remainder of this post provides a quick overview of how the GENEVE overlay is configured. The intent here is that you can consume this in 15 minutes or so (6 minutes may be a bit optimistic, clickbait!) and use it as a primer to dig deeper into the official VMware and Dell documentation (I don't want you to break anything on my watch!). I've provided a link at the bottom of this post where you can find more detail. For brevity I have left out some of the low-level configuration, e.g. anti-affinity rules for the NSX-T Managers. The goal here is to concentrate on the big picture. Obviously, in a production environment we would be checking these items off.
Next month, I'll follow up with how we configure the NSX-T Edge for VxRail. There is a little bit more to this, but the hope is that we can cover it in 15 minutes also.
We start off with a standard 4-node VxRail cluster (E660F). At configuration time we picked the 2 x 25GbE network profile. The first few steps are straightforward:
Have your network admin (hopefully you!) configure the overlay/TEP VLAN on both ToR switches and add it to the trunk interfaces facing each of the VxRail hosts. At this point you don't need to configure anything on VxRail or vSphere. In this instance it's VLAN 704.
Locate the NSX-T Manager OVF on VMware.com that is compatible with the version of vSphere you are running. We are going to run NSX-T version 126.96.36.199 on vSphere 7.0.3.
Ensure that all your DNS records are up to date and that your NTP service, wherever it is configured, is up and running.
Deploy the first NSX-T Manager via the OVF onto one of the hosts in the cluster.
Once it is up and running, browse to the first NSX-T Manager you have just deployed and add the vCenter Server as a Compute Manager. This is a relatively simple process. Navigate to SYSTEM -> FABRIC -> COMPUTE MANAGERS -> ADD COMPUTE MANAGER.
Once your vCenter Server has been registered and the connection status is green, you can leverage NSX-T Manager to deploy the secondary and tertiary NSX-T Managers.
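Before deploying the second and third managers, it is worth re-checking the DNS prerequisite from the earlier step, since each manager needs matching forward and reverse records. The following is a minimal Python sketch of that check, run from any admin workstation. The function name check_dns and any FQDN or IP address you pass in are my own illustrative assumptions, not values from this environment.

```python
import socket


def check_dns(fqdn, expected_ip,
              forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Return a list of problems; an empty list means forward and reverse agree.

    The resolver functions are parameters so the check can be exercised
    without touching a real DNS server.
    """
    problems = []

    # Forward lookup: name -> IPv4 address
    try:
        resolved = forward(fqdn)
        if resolved != expected_ip:
            problems.append(
                f"{fqdn} resolves to {resolved}, expected {expected_ip}")
    except OSError as exc:  # socket.gaierror / socket.herror are OSError subclasses
        problems.append(f"forward lookup for {fqdn} failed: {exc}")

    # Reverse lookup: IPv4 address -> name (PTR record)
    try:
        ptr_name = reverse(expected_ip)[0]
        if ptr_name.rstrip(".").lower() != fqdn.rstrip(".").lower():
            problems.append(
                f"{expected_ip} reverse-resolves to {ptr_name}, expected {fqdn}")
    except OSError as exc:
        problems.append(f"reverse lookup for {expected_ip} failed: {exc}")

    return problems
```

Calling check_dns with a manager's FQDN and its planned IP (for example a hypothetical "nsxt-mgr-01.example.local" and "192.0.2.11") returns an empty list when both records line up, and a list of human-readable problems otherwise.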
Now you have primed the environment to stand up the NSX-T fabric with a GENEVE overlay. Before we go any further, let's have a look at the state of the cluster's vSphere Distributed Switch (VDS), as deployed by VxRail Manager. Note the inherited default switch name from the initial deployment. We will come back to this at the end of the post.
Next up, we want to configure the NSX-T policy objects for our Transport Nodes and apply them to each of the hosts in the VxRail cluster. Again, I'll point you to the official VMware documentation (link at the bottom of the post), but the logic is broadly as follows:
Create an 'Overlay Transport Zone'. There is a default zone set up already, but in this instance we will create a net new Transport Zone called 'VxRail-Compute-Transport-Overlay'. The Overlay Transport Zone is a marker/setting applied to a Transport Node (ESXi host/KVM host) that defines the extent to which a Layer 2 virtual network (VNI/Segment) is reachable. We will use a Transport Node Profile to attach this Transport Zone to each of the hosts in our cluster.
Create an 'Uplink Profile'. Again, there are multiple default options based on the teaming policy you wish to use. For clarity we will create a new Uplink Profile named 'VxRail-W01-Compute'. The Uplink Profile is a policy that allows us to:
Set the Teaming Policy (in this instance Loadbalance_SRCID).
Define the number and names of the active uplinks in the teaming policy. Note that these uplinks are logical constructs, so we can call them anything we wish. In this instance we call them 'U-1' and 'U-2'. Note: the number of active uplinks in the team determines the number of TEP (Tunnel Endpoint) interfaces that are created once the policy is applied. In this instance we will have 2 TEP interfaces per host.
Define the Transport VLAN, i.e. the VLAN over which we will build the GENEVE tunnels between our hosts. Remember, we had this plumbed at the physical switch level already. This will be VLAN 704.
Create the 'Transport Node Profile'. We will use the Transport Node Profile to collate the following information into a single policy that will be applied to our Transport Nodes:
The VMware VDS we wish to use. We have only one choice here, the default VDS created by VxRail Manager: 'VMware HCIA Distributed Switch'.
The Transport Zone we wish to apply to the cluster. We will use the one we configured previously: 'VxRail-Compute-Transport-Overlay'.
The Uplink Profile we wish to attach. We use the one just configured: 'VxRail-W01-Compute'.
How we wish to allocate an IP address to each of the TEP interfaces. In this instance we will use a static pool configured in NSX-T Manager, derived from the IP space allocated to VLAN 704. We have created a pool named 'Host-TEP'.
Finally, and probably most importantly, the mapping of the logical uplinks we created previously (U-1, U-2) to the physical uplinks present on the VDS we have discovered (uplink1 and uplink2).
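To make the relationship between these objects a little more concrete, here is a minimal Python sketch that models the Uplink Profile and the Transport Node Profile described above and checks that every logical uplink lands on a real VDS uplink. To be clear, this is not the NSX-T API or its object schema: the class names, field names and the validate helper are my own illustrative assumptions, and only the object names and values (VLAN 704, U-1/U-2, uplink1/uplink2, the 4-node cluster and so on) come from this post.

```python
from dataclasses import dataclass


@dataclass
class UplinkProfile:
    name: str
    teaming_policy: str
    active_uplinks: list      # logical uplink names, free-form labels
    transport_vlan: int


@dataclass
class TransportNodeProfile:
    name: str
    vds_name: str
    transport_zone: str
    uplink_profile: UplinkProfile
    tep_ip_pool: str
    uplink_map: dict          # logical uplink name -> VDS uplink name

    def teps_per_host(self):
        # One TEP interface is created per active uplink in the team.
        return len(self.uplink_profile.active_uplinks)

    def validate(self, vds_uplinks):
        """Return a list of mapping errors; empty means the mapping is consistent."""
        errors = []
        for logical in self.uplink_profile.active_uplinks:
            if logical not in self.uplink_map:
                errors.append(f"logical uplink {logical} is not mapped")
        for logical, physical in self.uplink_map.items():
            if logical not in self.uplink_profile.active_uplinks:
                errors.append(f"{logical} is not in the uplink profile")
            if physical not in vds_uplinks:
                errors.append(f"{physical} does not exist on {self.vds_name}")
        targets = list(self.uplink_map.values())
        if len(set(targets)) != len(targets):
            errors.append("two logical uplinks map to the same VDS uplink")
        return errors


uplink_profile = UplinkProfile(
    name="VxRail-W01-Compute",
    teaming_policy="Loadbalance_SRCID",
    active_uplinks=["U-1", "U-2"],
    transport_vlan=704,
)

tn_profile = TransportNodeProfile(
    name="VxRail-TN-Policy",
    vds_name="VMware HCIA Distributed Switch",
    transport_zone="VxRail-Compute-Transport-Overlay",
    uplink_profile=uplink_profile,
    tep_ip_pool="Host-TEP",
    uplink_map={"U-1": "uplink1", "U-2": "uplink2"},
)

HOSTS_IN_CLUSTER = 4
mapping_errors = tn_profile.validate(vds_uplinks=["uplink1", "uplink2"])
tep_addresses_needed = HOSTS_IN_CLUSTER * tn_profile.teps_per_host()
print(mapping_errors, tep_addresses_needed)
```

The useful side effect of modelling it this way is the pool sizing: with two active uplinks per host and four hosts, the 'Host-TEP' pool needs at least eight free addresses in the VLAN 704 range before the profile is applied.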
At first glance there do seem to be quite a few moving parts and elements to configure, but ultimately we are just creating a logical policy in NSX-T and creating a relationship with the existing VDS configured on the vSphere cluster. Once we see it through that lens, the process becomes relatively intuitive. The following gallery steps through what we have done; hopefully this will help provide some clarity.
Apply the Policy: Transport Node Profile
The next step is relatively straightforward as long as everything is configured correctly (fingers crossed). It is simply a matter of attaching our new Transport Node Profile, VxRail-TN-Policy, to our VxRail cluster. This initiates the deployment of the various control and data plane software onto the individual hosts and configures the VDS with NSX capability.
If you recall, earlier we took a quick snapshot of the VDS as deployed by VxRail Manager. During the NSX fabric configuration we leveraged this VDS for the overlay. We can see this at the vCenter level now.
Of course, we do need to dig a little deeper into what has been configured and do some high-level functional tests to ensure the newly deployed overlay fabric is working. To do this we will do some reachability testing between each of the deployed Tunnel Endpoints (TEPs) across each host. Once we have TEP-to-TEP reachability over the physical fabric we can be pretty confident all is well.
During configuration, NSX-T Manager applied an IP address from the internal pool to each of the Transport Node's TEP interfaces (2 per ESXi host) as per the following:
One further point to note is that vmk10 and vmk11 are deployed on a separate TCP/IP stack from the other VMkernel interfaces on the host (management, vMotion, vSAN etc.), but do reside on the same VDS. They have also inherited the switch MTU of 1600, which is enough to accommodate the additional GENEVE header overhead. (I do like to bump all the way up to 9216 where possible: 1500 is the standard Ethernet MTU, 1600 is a baby giant frame, and anything above that up to 9216 is a jumbo frame.)
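For anyone who wants to see where the headroom goes, the arithmetic behind that MTU statement can be sketched in a few lines of Python. The header sizes below are the standard IPv4, UDP, ICMP and Ethernet sizes plus the 8-byte GENEVE base header; the function names are my own, and I am assuming an IPv4 underlay, no inner VLAN tag and, by default, no GENEVE options (real traffic may carry some, which is exactly why a little slack is welcome).

```python
# Header sizes in bytes
OUTER_IPV4 = 20       # outer IPv4 header, no IP options
UDP = 8               # GENEVE rides on UDP
GENEVE_BASE = 8       # fixed GENEVE header, before any variable-length options
INNER_ETHERNET = 14   # the guest's Ethernet header travels inside the tunnel
ICMP_HEADER = 8       # relevant for ping-style tests only


def required_underlay_mtu(inner_mtu=1500, geneve_options=0):
    """IP MTU the TEP VLAN must carry so a full-size guest packet is not fragmented."""
    return (inner_mtu + INNER_ETHERNET + GENEVE_BASE
            + geneve_options + UDP + OUTER_IPV4)


def max_ping_payload(vmk_mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return vmk_mtu - OUTER_IPV4 - ICMP_HEADER


def ip_packet_size(ping_payload):
    """On-the-wire IP packet size produced by a ping with the given payload."""
    return ping_payload + ICMP_HEADER + OUTER_IPV4


print(required_underlay_mtu())      # guest MTU 1500, no GENEVE options
print(1600 - required_underlay_mtu())  # bytes left over for GENEVE options
print(max_ping_payload(1600))
print(ip_packet_size(1550))
```

Under those assumptions a 1500-byte guest packet needs a 1550-byte underlay MTU, so an MTU of 1600 leaves 50 bytes for GENEVE options. It also shows that a ping payload of 1550 bytes produces a 1578-byte IP packet, comfortably above 1500 and below 1600, which is what makes it a meaningful test size.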
After enabling SSH on each host, we will SSH (PuTTY in my case) into the first host. From here we will ping the TEPs of every other host in the environment with:
A source interface of vmk10 and vmk11, via the 'vxlan' TCP/IP stack (even though this is a GENEVE tunnel).
A frame size of 1550 (excluding GENEVE overhead).
The Do Not Fragment bit set to on (we don't want a false positive when testing the MTU).
Log on to each host and use esxcli to identify each TEP address (vmk10/vmk11).
Execute vmkping to the TEPs of every other host (vmk10/vmk11 on all other hosts).
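With four hosts and two TEPs each, the test matrix gets tedious to type by hand, so here is a small Python sketch that simply prints the vmkping command lines to paste into the SSH session on one host. It uses the vmkping options for the netstack (++netstack), source interface (-I), payload size (-s) and Do Not Fragment (-d). The host names and TEP addresses are hypothetical placeholders, since the real ones come from your 'Host-TEP' pool; on each host, something like `esxcli network ip interface ipv4 get --netstack=vxlan` should list them, but do check the exact flags against your ESXi build.

```python
def vmkping_commands(local_vmks, remote_teps, size=1550, netstack="vxlan"):
    """Build one vmkping command per (local TEP interface, remote TEP address) pair."""
    commands = []
    for vmk in local_vmks:
        for _host, addresses in sorted(remote_teps.items()):
            for address in addresses:
                commands.append(
                    f"vmkping ++netstack={netstack} -I {vmk} "
                    f"-s {size} -d {address}")
    return commands


# Hypothetical TEP addresses for the three *other* hosts in the cluster,
# as seen from the first host. Replace with the values esxcli reports.
REMOTE_TEPS = {
    "esxi-02": ["192.0.2.12", "192.0.2.22"],
    "esxi-03": ["192.0.2.13", "192.0.2.23"],
    "esxi-04": ["192.0.2.14", "192.0.2.24"],
}

commands = vmkping_commands(["vmk10", "vmk11"], REMOTE_TEPS)
for command in commands:
    print(command)
```

From the first host this yields twelve commands (2 local TEP interfaces x 6 remote TEP addresses). Repeating the run from each of the other hosts, with that host removed from the remote list, covers the full mesh.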
As mentioned at the top of this post, please refer to the more detailed official documentation (especially if doing this in production!).
Now that we have the overlay built, we need to start thinking about how we can reach the external world via the Edge cluster. Stay tuned next month for another post on how we will achieve this using the exact same physical infrastructure, so there is no requirement for additional VxRail hosts.
Opinions expressed in this article are entirely my own and may not be representative of the views of Dell Technologies.