When shutting down and then starting up a VCF (4.X) on VxRail environment, the sequence of steps should be first and foremost in our minds. Not following the correct procedure could corrupt the environment and leave you unable to start it up again.
Knowing that shutting down components in the wrong way can lead to a whole lot of pain, I went searching through my usual go-to spots for a step-by-step procedure guide I could leverage. My search came up short. The most recent power control documentation I could find for VCF on VxRail was for VCF 3.10, and while that is useful for VCF 3.11, the procedure for VCF 4.X is very different.
To fill this gap, this post covers the sequence of tasks that you must complete to shut down and start up your VCF 4.X on VxRail environment, so that the environment comes back up safely and without corruption after a shutdown. In addition, I have included step-by-step procedure documentation and videos in which I demonstrate how these tasks were completed within our lab environment running VCF 4.2.1 on VxRail 7.0.131.
But first, let’s take a look at the typical components that you will see in a VCF 4.X on VxRail environment.
2. VCF (4.X) on VxRail Components
The diagram below shows an overview of the components that make up a typical VCF on VxRail standard architecture, with one management domain (minimum 4 hosts) and one VI workload domain (minimum 3 hosts).
Always shut down the workload domains before you shut down the management domain. Conversely, when starting up your environment, you must start the management domain before you can start up the workload domains.
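The domain-level rule can be written down as a small sketch. The function name `domain_power_order` and the domain labels `wld01` and `mgmt` are my own placeholders for illustration, not names from any VMware or Dell tool.

```python
from typing import List


def domain_power_order(workload_domains: List[str], management_domain: str, action: str) -> List[str]:
    """Return the order in which whole domains are handled for a power action."""
    if action == "shutdown":
        # Workload domains go down first, the management domain last.
        return list(workload_domains) + [management_domain]
    if action == "startup":
        # The management domain comes up first, workload domains after it.
        return [management_domain] + list(workload_domains)
    raise ValueError(f"unknown action: {action!r}")


# Hypothetical domain labels, used only for this example.
print(domain_power_order(["wld01"], "mgmt", "shutdown"))
print(domain_power_order(["wld01"], "mgmt", "startup"))
```

The sketch only orders whole domains. The order of components inside each domain is covered in Section 3.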
Leveraging this diagram, in Section 3 I have listed the sequence of events that must be followed when completing the power control procedures for the VCF (4.x) on VxRail components shown.
Note: While the vCenter Server VM and NSX Local Managers for the workload domain sit within the management domain cluster, the shutdown and start-up of these components are completed as part of the power control procedures of the workload domain (see Sections 3.1.2 and 3.2.3 below).
3. Sequence of Events
Here is the high-level order of steps that you must follow to ensure a safe shutdown of your environment.
3.1.1 Complete all prerequisite tasks
I. Complete back-ups of all your management components.
II. Note down the hostnames and IP addresses of all ESXi hosts in the management and workload domains.
III. Ensure switch configurations are available external to your VCF System.
IV. If a vSphere Storage APIs for Data Protection (VADP) based backup solution is running on the management clusters, verify that the solution is properly shut down by following the vendor guidance.
V. Shut down all virtualised customer workloads.
VI. Migrate (via vMotion) the management domain vCenter to the first ESXi host in the management domain cluster.
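For prerequisite II, one way to capture the host list is to resolve each hostname and keep the result as CSV somewhere outside the VCF system. This is a minimal sketch using only the Python standard library. The function names, hostnames and addresses below are hypothetical, and the example resolves from a static table rather than live DNS.

```python
import csv
import io
import socket
from typing import Callable, Iterable, List, Tuple


def build_inventory(
    hostnames: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[Tuple[str, str]]:
    """Pair each ESXi hostname with the IP address it currently resolves to."""
    return [(name, resolve(name)) for name in hostnames]


def inventory_to_csv(inventory: List[Tuple[str, str]]) -> str:
    """Render the inventory as CSV text that can be saved outside the VCF system."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["hostname", "ip_address"])
    writer.writerows(inventory)
    return buffer.getvalue()


# Hypothetical hosts, resolved from a static table for the example.
known_hosts = {
    "esxi-mgmt-01.example.local": "192.0.2.11",
    "esxi-wld-01.example.local": "192.0.2.21",
}
print(inventory_to_csv(build_inventory(known_hosts, resolve=known_hosts.__getitem__)))
```

Leaving `resolve` at its default uses a live DNS lookup instead, which only makes sense while DNS is still up, that is, before the shutdown begins.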
3.1.2 Shutdown the Workload Domain
I. Site Recovery Manager (SRM) for the workload Domain
II. vSphere Replication for the Workload Domain
III. NSX Edge nodes
IV. NSX Local Managers
V. vSphere Cluster Services VMs, VxRail Manager VM, vSAN and ESXi hosts (completed by the VxRail Manager plug-in)
VI. vCenter Server (WLD only)
3.1.3 Shutdown the Management Domain
I. vRealize Automation cluster
II. vRealize Operations Manager cluster
III. Clustered Workspace ONE Access
IV. VMware vRealize Suite Lifecycle Manager
V. SRM for the management domain
VI. vSphere Replication for the management domain
VII. vRealize Log Insight cluster
VIII. Standalone Workspace ONE Access
IX. NSX Edge nodes for the management domain
X. NSX Local Managers
XI. SDDC Manager
XII. VxRail Manager
XIII. vSphere Cluster Services VMs
XIV. vCenter Server (mgmt.)
XV. ESXi hosts and vSAN
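To illustrate how the management domain order above could be enforced in a runbook script, here is a rough sketch that always picks the earliest component in the list that is still powered on. The helper `next_to_shut_down` and the `powered_on` map are hypothetical. In practice the power state would have to come from your own tooling, which is not shown here.

```python
from typing import Dict, List, Optional

# Management domain shutdown order, copied from the list above.
MGMT_SHUTDOWN_ORDER: List[str] = [
    "vRealize Automation cluster",
    "vRealize Operations Manager cluster",
    "Clustered Workspace ONE Access",
    "vRealize Suite Lifecycle Manager",
    "SRM",
    "vSphere Replication",
    "vRealize Log Insight cluster",
    "Standalone Workspace ONE Access",
    "NSX Edge nodes",
    "NSX Local Managers",
    "SDDC Manager",
    "VxRail Manager",
    "vSphere Cluster Services VMs",
    "vCenter Server (mgmt.)",
    "ESXi hosts and vSAN",
]


def next_to_shut_down(order: List[str], powered_on: Dict[str, bool]) -> Optional[str]:
    """Return the first component in the order that is still powered on, or None."""
    for component in order:
        # Components that are not deployed are absent from the map and skipped.
        if powered_on.get(component, False):
            return component
    return None


# Example: the first four components are not deployed and SRM is already off.
state = {name: True for name in MGMT_SHUTDOWN_ORDER[4:]}
state["SRM"] = False
print(next_to_shut_down(MGMT_SHUTDOWN_ORDER, state))
```

Because the function only ever returns the earliest remaining component, a script built on it cannot skip ahead to, say, vCenter Server while SDDC Manager is still running.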
Here is the high-level order of steps that you must follow to ensure a safe start-up of your environment.
3.2.1 Complete all prerequisite tasks
I. Verify that external services such as Active Directory, DNS, NTP, SMTP, and FTP or SFTP are available.
II. If a vSphere Storage APIs for Data Protection (VADP) based backup solution is deployed on the default management cluster, verify that the solution is properly started and operational according to the vendor guidance.
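The external service check in prerequisite I can be partly scripted. The sketch below assumes each service is reachable on a TCP port. The hostnames and ports are placeholders for your own environment. NTP uses UDP, so a TCP probe like this one does not cover it and it would need a separate check.

```python
import socket
from typing import Callable, Dict, List, Tuple


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def unavailable_services(
    services: Dict[str, Tuple[str, int]],
    probe: Callable[[str, int], bool] = tcp_probe,
) -> List[str]:
    """Return the names of services whose endpoint could not be reached."""
    return [name for name, (host, port) in services.items() if not probe(host, port)]


# Hypothetical endpoints; replace them with the ones your environment uses.
SERVICES = {
    "Active Directory (LDAP)": ("ad01.example.local", 389),
    "DNS": ("dns01.example.local", 53),
    "SMTP": ("smtp.example.local", 25),
    "SFTP": ("backup.example.local", 22),
}


def stand_in_probe(host: str, port: int) -> bool:
    """Stand-in for illustration: pretend only the SMTP relay is down."""
    return port != 25


print(unavailable_services(SERVICES, probe=stand_in_probe))
```

Dropping the `probe` argument makes the function use the real `tcp_probe`. An empty result means every listed endpoint accepted a connection, which is a reachability check only and not proof that the service itself is healthy.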
3.2.2 Start-up Management Domain
I. ESXi hosts and vSAN
II. vCenter Server for management domain
III. vSphere Cluster Services VMs
IV. VxRail Manager
V. SDDC Manager
VI. NSX Manager Nodes for management domain
VII. NSX Edge nodes for management domain
VIII. Standalone Workspace ONE Access
IX. vRealize Log Insight cluster
X. vSphere Replication for the management domain
XI. SRM for the management domain
XII. VMware vRealize Suite Lifecycle Manager
XIII. Clustered Workspace ONE Access
XIV. vRealize Operations Manager cluster
XV. vRealize Automation cluster
3.2.3 Start-up Workload Domain
I. WLD vCenter Server
II. ESXi hosts, VxRail Manager VM, vSAN and vSphere Cluster Services (VxRail
III. NSX Local Managers
IV. NSX Edge Nodes
V. vSphere Replication for the Workload Domain
VI. Site Recovery Manager (SRM) for the workload Domain
VII. Virtualized customer workloads
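Comparing this list with Section 3.1.2, the workload domain start-up order is the shutdown order reversed, with the customer workloads (which were shut down as a prerequisite) brought back last. A short sketch of that relationship follows. The component labels are shortened versions of the list entries and the function name is my own.

```python
from typing import List

# Workload domain shutdown order from Section 3.1.2, with shortened labels.
WLD_SHUTDOWN_ORDER: List[str] = [
    "Site Recovery Manager (SRM)",
    "vSphere Replication",
    "NSX Edge nodes",
    "NSX Local Managers",
    "vCLS VMs, VxRail Manager VM, vSAN and ESXi hosts",
    "vCenter Server (WLD)",
]


def wld_startup_order(shutdown_order: List[str]) -> List[str]:
    """Derive the workload domain start-up order from its shutdown order."""
    # Reverse the component order, then bring customer workloads back last.
    return list(reversed(shutdown_order)) + ["Virtualized customer workloads"]


for step, component in enumerate(wld_startup_order(WLD_SHUTDOWN_ORDER), start=1):
    print(f"{step}. {component}")
```

Keeping a single list and deriving the other from it avoids the two sequences drifting apart if you tailor them to your own environment.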
4. Lab Environment – Demos & Procedure Documentation
I was fortunate to have access to a lab environment running VCF 4.2.1 on VxRail 7.0.131. The exact components installed in this lab environment are shown in Diagram 2 below, with four ESXi hosts in the management domain cluster and three ESXi hosts in the workload domain cluster.
Access to this lab allowed me to test the sequence of events required for these components, and to prepare step-by-step procedure documentation and videos that can be used as a guide for the safe shutdown and start-up of a similar VCF on VxRail environment.
The step-by-step procedure documentation and video demos can be found below.
4.1 Step-by-step Procedure Documentation
1. Shutdown VCF (4.x) on VxRail (7.x) – Procedure Guide
2. Start-up VCF (4.x) on VxRail (7.x) – Procedure Guide
1. Shutdown VCF (4.x) on VxRail (7.x) - Demo
2. Start-up VCF (4.x) on VxRail (7.x) - Demo
Note: The procedure documentation and video demos included in this post cover the shutdown and start-up of the components installed in our lab environment only (see Diagram 2 above for reference). You can find guidance on how to shut down and start up additional VCF on VxRail components, such as vRealize components, on the www.vmware.com website.