Author Archives: James Cruickshank

vSAN Default Storage Policy Per vSAN Datastore

Every day is a school day!!

I recently asked the vSAN vExpert slack channel the following: “Is there a way to set a default storage policy for a specific vSAN cluster? Use case – shared vCenter server with 10 hybrid vSAN clusters and 1 “private” customer with dedicated cluster running AF vSAN. The Private customer wants to use RAID6 but their deployment method just now does not allow the selection of a storage policy. We can’t change the default policy as the other 10 hybrid clusters are using this (and also don’t have a way to select a policy during deployment).”

Slightly embarrassingly, I didn’t know this, but Steve Kaplan (@stvkpln) told me how to do it!

If you browse to the vSAN datastore object, then Manage > General, you can set the default policy for that datastore. Simples!
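
To make the effect concrete, here is a minimal Python sketch of how the policy for a new VM would be resolved once a datastore-level default is set. It only models the behaviour and does not call vSphere; the datastore names and the `resolve_policy` helper are hypothetical.

```python
from typing import Optional

# Hypothetical names for illustration; this models the behaviour only.
GLOBAL_DEFAULT = "vSAN Default Storage Policy"

# Per-datastore defaults; datastores not listed fall back to the global default.
datastore_defaults = {
    "private-af-vsan": "RAID6",
}


def resolve_policy(datastore: str, requested: Optional[str] = None) -> str:
    """Return the policy a new VM gets: explicit choice, else datastore default, else global."""
    if requested is not None:
        return requested
    return datastore_defaults.get(datastore, GLOBAL_DEFAULT)


print(resolve_policy("private-af-vsan"))  # RAID6
print(resolve_policy("hybrid-vsan-01"))   # vSAN Default Storage Policy
```

In this model the private customer's deployments land on RAID6 without anyone selecting a policy, while the hybrid clusters keep using the shared default.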

vRNI 3.3 Cluster Upgrade Process

When upgrading a vRNI cluster you currently have to follow this KB from VMware:

This post is just a quick run-through of the process and how it went for me.

The first step is to stop the services running on each platform or proxy appliance. To do this, SSH to the appliance (username: consoleuser, password: ark1nc0ns0l3; these are publicly available in the CLI guide for vRNI) and then run “services stop”.


CMSSO-UTIL – Failed Service Registration

All credit for this post belongs to one of my colleagues called Imran Mughal, a very talented vSphere engineer!

We recently came across an issue during some site migration work on our PSCs. The migration was driven by the fact that all of our vCenter Servers were trying to authenticate users via a single load-balanced pair of PSCs, regardless of physical location, instead of the PSCs in the datacentre local to each vCenter. This was because we only used a single site name that covered 4 separate datacentre locations; all of our PSCs, regardless of physical location, formed part of this single site.

As we already had live services running on our vCenters, we decided to redirect the vCenters to other PSCs and rebuild in pairs (as there is currently no way to manually change the site name or create new sites over an existing deployment).

This was done using a combination of these two KB articles:

KB2131191 & KB2113917


NSX – ESG ECMP and remembering to disable the local firewall!

I ran into a bit of a problem with an NSX deployment I was recently working on… sadly human error caused the issue but as the old saying goes “better to find problems during testing than in production” (I just made that up), or “DCYCJ” (Double Check Your Config James).

A user got in touch after deploying a simple web server, saying they could only intermittently connect to their site from their desktop.


I’ve already built out the ESG that will serve as my DHCP server; check out my ESG Basics post for the configuration steps. This post will focus solely on configuring the DHCP service on an existing ESG.

To configure DHCP, go to the ‘Networking and Security’ pane in the vSphere web client, then go to ‘NSX Edges’ and double click on the Edge instance you wish to configure DHCP on. Select Manage > DHCP:


NSX – Edge Service Gateway (Basics)

Deploying an ESG (Edge Service Gateway) starts off in the same way as a DLR (see my DLR basics post). The ESG is the next layer above a DLR and acts as the perimeter to the “real” world. The ESG provides tonnes of functionality, and this is where I found the biggest leap from being a traditional VMware infrastructure engineer (think vCenter, ESXi, VSAN, dvS, dvPortgroups, VMs etc.) to becoming an SDDC engineer.

The ESG can do the following (I’m hoping to break all these functions down into posts over the next few months):

  • Firewall – Supported rules include IP 5-tuple configuration with IP and port ranges for stateful inspection for all protocols.
  • NAT – Separate controls for Source and Destination IP addresses, as well as port translation.
  • DHCP – Configuration of IP pools, gateways, DNS servers, and search domains.
  • Site-to-Site Virtual Private Network (VPN) – Uses standardized IPsec protocol settings to interoperate with all major VPN vendors.
  • L2 VPN – Provides the ability to stretch your L2 network.
  • SSL VPN-Plus – Enables remote users to connect securely to private networks behind an NSX Edge gateway.
  • Load Balancing – Simple and dynamically configurable virtual IP addresses and server groups.
  • High Availability – High availability ensures an active NSX Edge on the network in case the primary NSX Edge virtual machine is unavailable.

Let’s get down to business… Web Client > Networking & Security > NSX Edges, once here click the green cross:

NSX – Distributed Logical Router (Basics)

In this post I will step through the basic deployment steps of a Distributed Logical Router (DLR)… but first what is a DLR?

NSX provides the ability to route traffic between two separate Layer 2 segments (for example, VMs on two different subnets) within the hypervisor, without ever having to send the packet out to a physical router. Traditionally, if the application server VM in VLAN 101 needs to talk to the DB server VM in VLAN 102, the packet has to go out of the VLAN 101 tagged port group via the uplink ports to a Layer 3 enabled physical switch, which performs the routing and sends the packet back to the VLAN 102 tagged port group, even if both VMs reside on the same ESXi server (this is referred to as “hairpin” traffic).

This new ability to route within the hypervisor is made available with DLRs and ESGs (Edge Service Gateways, which we will cover in a later post):

  • East-West routing =  Distributed Logical Router (DLR)
  • North-South routing = NSX Edge Gateway device
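
As a rough illustration of that split, the Python sketch below classifies a flow by whether both endpoints sit on internal segments. It is a simplified model, not NSX code; the two subnets and the `classify` helper are hypothetical values picked for the example.

```python
import ipaddress

# Hypothetical subnets standing in for two logical segments inside the data centre.
INTERNAL_SEGMENTS = [
    ipaddress.ip_network("10.0.101.0/24"),
    ipaddress.ip_network("10.0.102.0/24"),
]


def segment_of(ip: str):
    """Return the internal segment containing ip, or None if the address is external."""
    addr = ipaddress.ip_address(ip)
    for net in INTERNAL_SEGMENTS:
        if addr in net:
            return net
    return None


def classify(src: str, dst: str) -> str:
    """Say which component handles a flow in this simplified model."""
    src_seg, dst_seg = segment_of(src), segment_of(dst)
    if src_seg is None or dst_seg is None:
        return "north-south: ESG"
    if src_seg == dst_seg:
        return "same segment: switched, no routing"
    return "east-west: DLR"


print(classify("10.0.101.10", "10.0.102.20"))  # east-west: DLR
print(classify("10.0.101.10", "10.0.101.11"))  # same segment: switched, no routing
print(classify("10.0.101.10", "8.8.8.8"))      # north-south: ESG
```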

Let’s get down to the coal face… Web Client > Networking & Security > NSX Edges, once here click the green cross:


NSX – Logical Switch

Next up on the NSX build out is creating logical switches! A logical switch is a distributed port group on a distributed switch. So why logical? Because it gets a unique VNI (VXLAN Network Identifier) to overlay the L2 network.

Every time you create a Logical Switch you are creating a VXLAN. A great way to think about the power of a Logical Switch is to consider how much time and paperwork is required to add a new VLAN to ESXi hosts (in a large enterprise this can take days). With NSX I can now do this in minutes.
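
The idea of each Logical Switch getting its own VNI can be sketched in a few lines of Python. This is a toy model, not how NSX allocates IDs internally: the `SegmentIdPool` class, the switch names and the 5000-5999 range are assumptions chosen for illustration. The only hard facts used are that a VLAN ID is 12 bits wide and a VXLAN VNI is 24 bits wide.

```python
# VLAN IDs are 12 bits wide, VXLAN VNIs are 24 bits wide (RFC 7348).
VLAN_ID_SPACE = 2 ** 12   # 4096
VNI_SPACE = 2 ** 24       # 16777216


class SegmentIdPool:
    """Toy allocator: hands each new logical switch the next free VNI in a range."""

    def __init__(self, first: int, last: int):
        if not (0 <= first <= last < VNI_SPACE):
            raise ValueError("pool must fit inside the 24-bit VNI space")
        self._next = first
        self._last = last
        self.switches = {}  # logical switch name -> VNI

    def create_logical_switch(self, name: str) -> int:
        if name in self.switches:
            raise ValueError(f"logical switch {name!r} already exists")
        if self._next > self._last:
            raise RuntimeError("segment ID pool exhausted")
        vni = self._next
        self._next += 1
        self.switches[name] = vni
        return vni


pool = SegmentIdPool(5000, 5999)  # assumed pool range, for illustration only
print(pool.create_logical_switch("web-tier"))  # 5000
print(pool.create_logical_switch("app-tier"))  # 5001
print(VNI_SPACE // VLAN_ID_SPACE)              # 4096
```

The last line shows why the overlay scales so much further than VLANs: the VNI space is 4096 times larger than the VLAN ID space.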

To create a Logical Switch go to Web Client > Networking & Security > Logical Switches then click the green plus icon.


NSX – Transport Zone

Next up is creating a Transport Zone! Transport Zones are a way to define which clusters/hosts are able to see and participate in the virtual networks that are being configured. It’s like a container that houses NSX Logical Switches along with their details, which is then assigned to a collection of ESXi hosts that should be able to communicate with each other across the physical network infrastructure.

To configure a transport zone click on the Transport Zones sub-tab, then click the green plus button to add a new transport zone.

  • Name – Example transport zone names I have seen in the field are “Management”, “Edge Services”, “Resources”, “Compute”, “CustomerName” etc.
  • Description – A short description of the transport zone’s function (can be left blank like I have)
  • Control Plane Mode – The method that VXLAN will use to distribute information across the control plane. Here are the official details as per the NSX Installation Guide:
    • Multicast: Multicast IP addresses on the physical network are used for the control plane. This mode is recommended only when you are upgrading from older VXLAN deployments. Requires PIM/IGMP on the physical network.
    • Unicast: The control plane is handled by an NSX controller. All unicast traffic leverages headend replication. No multicast IP addresses or special network configuration is required.
    • Hybrid: The optimized unicast mode. Offloads local traffic replication to the physical network (L2 multicast). This requires IGMP snooping on the first-hop switch, but does not require PIM. The first-hop switch handles traffic replication for the subnet.
  • Clusters – Pick the clusters that should be added to the transport zone.
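
To help pick a control plane mode, the requirements listed above can be captured in a small Python lookup. This is a sketch of the decision only and not an NSX API; the feature labels (`pim`, `igmp`, `igmp_snooping`) and the `supported_modes` helper are hypothetical, and "Requires PIM/IGMP" is read here as needing both.

```python
from enum import Enum


class ControlPlaneMode(Enum):
    MULTICAST = "Multicast"
    UNICAST = "Unicast"
    HYBRID = "Hybrid"


# Physical-network requirements per mode, as summarised in the list above.
# Assumption: "Requires PIM/IGMP" is interpreted as needing both features.
REQUIREMENTS = {
    ControlPlaneMode.MULTICAST: {"pim", "igmp"},
    ControlPlaneMode.UNICAST: set(),
    ControlPlaneMode.HYBRID: {"igmp_snooping"},
}


def supported_modes(physical_features):
    """Return the modes whose requirements are all met by the physical network."""
    available = set(physical_features)
    return [mode for mode, needs in REQUIREMENTS.items() if needs <= available]


print([m.value for m in supported_modes([])])                 # ['Unicast']
print([m.value for m in supported_modes(["igmp_snooping"])])  # ['Unicast', 'Hybrid']
```

Unicast always qualifies because it needs nothing from the physical network, which is why it is the easiest starting point when you have no control over the underlay.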