Monthly Archives: June 2015

VSAN Network Partition

HA works differently on a VSAN cluster than on a non-VSAN cluster.

  • When HA is turned on in the cluster, FDM agent (HA) traffic uses the VSAN network and not the Management Network. However, when a potential isolation is detected, HA will ping the default gateway (or the specified isolation address) using the Management Network (see the sketch after this list for a quick way to check the VSAN network configuration).
  • When enabling VSAN, ensure vSphere HA is disabled. You cannot enable VSAN when HA is already configured. Either configure VSAN during the creation of the cluster or disable vSphere HA temporarily while configuring VSAN.
  • When there are only VSAN datastores available within a cluster, Datastore Heartbeating is disabled. HA will never use a VSAN datastore for heartbeating: the VSAN network is already used for network heartbeating, so using a VSAN datastore for heartbeating would not add anything.
  • When changes are made to the VSAN network, vSphere HA must be reconfigured.
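A quick way to confirm which VMkernel interface carries VSAN (and therefore HA) traffic is shown below. This is a minimal sketch: vmk2 is only an example interface name, and if the default gateway is not reachable from the VSAN network you would typically also look at the HA advanced options das.usedefaultisolationaddress and das.isolationaddress0.

# Show the VMkernel interface(s) tagged for Virtual SAN traffic
esxcli vsan network list

# Show the IP configuration of that interface (vmk2 is an example name)
esxcli network ip interface ipv4 get -i vmk2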

ESXi Isolation – VM with no underlying storage

Continue reading

vSphere 6 Deployment Topologies

vCenter Server with Embedded PSC and Database

The embedded PSC is meant for standalone sites where vCenter Server will be the only SSO integrated solution. In this case replication to another PSC should not be required and is not possible.

  • The embedded deployment is a single point of failure
  • Supports Windows and VCSA (vCenter Server Appliance) based deployments
  • Replication between PSCs not required
  • Multiple standalone instances supported
  • Sufficient for small scale deployments
  • Not suitable for use with other VMware products (vRA, NSX etc.)
  • Easy to deploy and maintain


Continue reading

VSAN Physical Disk Pre-Partitioned

Recently I added an ESXi host (HP DL380 G9 with 2 x 800GB SSD and 6 x 4TB magnetic disks) to a VSAN cluster; however, when creating the disk group only two of the disks were available for use.

After some investigation it turned out the other disks already had partitions. This was confirmed by running the following command:

ls /vmfs/devices/disks

Running partedUtil getptbl /vmfs/devices/disks/naa.5000xxxxxxxxx failed with the error: “Error: Can’t have a partition outside the disk!”

To resolve this (and allow the disks to be added to a VSAN disk group) I ran the following:

partedUtil setptbl /vmfs/devices/disks/naa.5000xxxxxxxxxxx msdos

Then reboot the host.

***This will destroy any data already on the disks***
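If several disks are affected, the partition layout of each candidate device can be checked in one pass before anything is overwritten. A minimal sketch, assuming the devices use the naa.* naming seen above:

# Read-only check: print the current partition table of every naa.* device
for disk in /vmfs/devices/disks/naa.*; do
  case "$disk" in *:*) continue ;; esac   # skip partition entries such as naa.xxx:1
  echo "== $disk =="
  partedUtil getptbl "$disk"
done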

VSAN Misconfiguration Detected

Always validate the network configuration. The VSAN Misconfiguration detected error is by far the most common error seen when configuring VSAN. Normally this means that either the port group has not been successfully configured for Virtual SAN or multicast has not been set up properly.

On Cisco switches, configuration will generally fail unless an IGMP snooping querier has been configured or IGMP snooping has been explicitly disabled on the ports used for Virtual SAN. Neither is set up in the default configuration, so even if the network admin says it is configured properly it may not be configured at all; double-check it to avoid any pain.
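Before (or while) chasing the switch configuration, two checks from the ESXi side can narrow things down. A hedged sketch: vmk2 is an example interface name, so substitute the VMkernel port that carries your Virtual SAN traffic:

# Show the multicast group addresses each host expects to use for Virtual SAN
esxcli vsan network list

# Watch whether IP multicast traffic is actually arriving on the VSAN interface
tcpdump-uw -i vmk2 -n ip multicast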

Memory States

Host memory is a limited resource. VMware vSphere incorporates sophisticated mechanisms that maximize the use of available memory through page sharing, resource-allocation controls, and other memory management techniques. However, several of vSphere's memory overcommitment techniques only kick in when the host is under memory pressure.
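For reference, the sketch below is one way to see where a host currently sits; it assumes the /Mem/MemMinFreePct advanced option is present on your build (a value of 0 means ESXi calculates the free-memory threshold itself using its sliding scale):

# Check the configured minimum free memory percentage
esxcli system settings advanced list -o /Mem/MemMinFreePct

# The current memory state (for example "high state") is shown at the end of
# the VMKMEM line in the esxtop memory view: run esxtop and press 'm'
esxtop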

Active Guest Memory

The amount of memory that is actively used, as estimated by the VMkernel based on recently touched memory pages; in other words, what the VMkernel believes is currently being actively used by the VM.

Continue reading

ESXi and VM CPU Performance Issues

The following is a description of some common ESXi and VM CPU Performance Issues:

High Ready Time

Ready time above 10% could indicate CPU contention and might impact the performance of CPU-intensive applications. However, some less CPU-sensitive applications and virtual machines can have much higher ready time values and still perform satisfactorily.
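When reading the vCenter real-time charts, which report CPU Ready in milliseconds per 20-second sample, the value has to be converted to a percentage before comparing it against a threshold like 10%. A rough per-vCPU conversion:

# 2000 ms of ready time in a 20,000 ms sample interval = 10% ready time
awk 'BEGIN { printf "%.1f%%\n", (2000 / 20000) * 100 }'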

High CoStop (CSTP) Time

CoStop time indicates that the VM has more vCPUs than it needs, and that the excess vCPUs create co-scheduling overhead that drags down the VM's performance. The VM will likely run better with fewer vCPUs. A vCPU with high CoStop time is being held back from running while its other, more-idle sibling vCPUs catch up to the busy one.
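Both counters are easiest to read per vCPU in esxtop; a quick sketch of how to get at them:

# In the esxtop CPU view (press 'c'), press 'e' and enter a VM's GID to expand
# it into individual vCPU worlds, then check %RDY and %CSTP for each vCPU
esxtop

# Or capture a batch sample for later analysis (10-second interval, 6 samples)
esxtop -b -d 10 -n 6 > esxtop-sample.csv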

Continue reading

vCPU States

vCPUs are always in one of four states: Run, Ready, CoStop, or Wait.

WAIT – This can occur when the virtual machine’s guest OS is idle (waiting for work), or when the virtual machine is waiting on vSphere tasks. Examples of vSphere tasks a vCPU may be waiting on include waiting for I/O to complete or waiting for ESXi-level swapping to complete. These non-idle vSphere system waits are called VMWAIT.
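In esxtop, %WAIT includes idle time, so a high %WAIT on its own is not necessarily a problem; the field worth checking is the one that excludes idle:

# In the esxtop CPU view (press 'c'), compare %WAIT (which includes idle time)
# with %VMWAIT, which excludes idle and only counts blocked/VMWAIT conditions
esxtop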

Continue reading

Removing VMFS Datastores from ESXi5.5 with VSAN

ESXi 5.5 and above stores coredumps on a datastore attached to the host, and it can also create a vsantraces directory there. Both of these can lock the datastore and prevent it from being deleted.

To check for and remove the coredump file do the following:

esxcli system coredump file list
Path                                    Active  Configured       Size
--------------------------------------  ------  ----------  ---------
/vmfs/volumes/xx/vmkdump/xxx.dumpfile    false       false  702545920
/vmfs/volumes/xxx/vmkdump/xxx.dumpfile    true        true  702545920
/vmfs/volumes/xxx/vmkdump/xxx.dumpfile   false       false  702545920

The output shows that there are three dump files blocking the datastore. Only the owning ESXi host can disable and delete them, so you have to find out which ESXi host is responsible for each file:
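On the owning host, the commands involved look roughly like the following. This is a hedged sketch: the path is a placeholder for the actual dumpfile path shown in the listing above.

# Deactivate the dump file that is currently configured/active on this host
esxcli system coredump file set --unconfigure

# Remove a specific dump file (--force is needed if it is still flagged active)
esxcli system coredump file remove --file /vmfs/volumes/DATASTORE/vmkdump/FILE.dumpfile --force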

Continue reading