Objective 4.2 – Optimize Virtual Machine resources

Tune Virtual Machine memory configurations

Along with allocating memory resources and setting limits, reservations and shares, there are a few other things to note:

Memory Hot Add: The Hot Add feature allows you to increase the memory allocation of a virtual machine (on supported guest OSes) whilst it is powered on. It is only possible to do this, however, if you have already enabled Memory Hot Add in the virtual machine’s settings, and that has to be done whilst the virtual machine is powered off:

TuneMem01
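
If you prefer not to click through the GUI, the same option can be enabled (with the VM powered off) via the advanced configuration parameters. A minimal sketch – these are the standard .vmx hot add parameters as I understand them, so verify them against your ESXi build before scripting anything:

    # Add to the virtual machine's .vmx file (VM powered off), or via
    # Edit Settings > Options > General > Configuration Parameters
    mem.hotadd = "TRUE"       # allow memory hot add
    vcpu.hotadd = "TRUE"      # allow vCPU hot add (covered later in this post)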

Swap File Location: You can change an individual virtual machine’s swap file location by changing the option in the virtual machine’s settings:

TuneMem02

 

Associate Memory Allocations with a NUMA Node: You can configure a virtual machine so that all of its future memory allocations come from a single NUMA node.

This shows as “Advanced Memory” on the Resources tab, but my lab kit isn’t up to much (a single socket), so I can’t set up NUMA. 🙁
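
For reference (I can’t demonstrate it on this kit), a VM can also be pinned to a NUMA node with the numa.nodeAffinity advanced setting – treat the value below as an illustrative example rather than a recommendation:

    # .vmx entry constraining the VM's CPU scheduling and memory to NUMA node 0
    numa.nodeAffinity = "0"
    # multiple nodes can be listed, e.g. "0,1"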

Tune Virtual Machine networking configurations

  • Check you’re using the latest NIC driver, both on the ESXi host and in the guest OS (VMware Tools installed and the VMXNET3 driver where possible)
  • Check NIC teaming is correctly configured
  • Check physical NIC properties – ensure speed and duplex are correct, and enable TOE (TCP Offload Engine) if possible
  • Add physical NICs to increase bandwidth
  • Enable NetQueue: this takes advantage of the ability of some network adapters to deliver network traffic to the system in multiple receive queues that can be processed separately, allowing processing to scale across multiple CPUs and improving receive-side networking performance
  • Consider DirectPath I/O
  • Consider the use of jumbo frames (see the example below)
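
A few of the checks above can be done quickly from the ESXi shell. These are generic esxcli examples – the vmnic and vSwitch names are placeholders for whatever exists in your environment:

    # list physical NICs with link state, speed and duplex
    esxcli network nic list
    # show driver and firmware details for a specific uplink
    esxcli network nic get -n vmnic0
    # enable jumbo frames on a standard vSwitch (port groups, VMkernel ports,
    # physical switches and guests must also be configured for the larger MTU)
    esxcli network vswitch standard set -v vSwitch0 -m 9000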

To monitor network optimisations:

esxtop (press ‘n’ to get network statistics);

  • %DRPTX – should be 0
  • %DRPRX – should be 0
  • You can also see which VM is using which pNIC in a team, pNIC speed and duplex

vCenter (Performance -> Advanced -> ‘Network’);

  • Network usage average (KBps), for both VMs and hosts
  • Dropped rx – should be 0
  • Dropped tx – should be 0

Tune Virtual Machine CPU configurations

Rather than just allocating a number of virtual CPUs to a virtual machine, it’s possible to define how these are split up in terms of virtual sockets and cores. This is generally done due to Guest OS/Application licensing restrictions:

tuneCPU1
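
Behind the scenes the sockets/cores split is controlled by a single advanced parameter. Purely as an illustration (the cores-per-socket value must divide evenly into the total vCPU count):

    # .vmx entries: 4 vCPUs presented as 2 sockets x 2 cores
    numvcpus = "4"
    cpuid.coresPerSocket = "2"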

 

As shown earlier in this post, it’s possible to hot add vCPUs on supported guest OSes, so long as the option has been enabled whilst the virtual machine is powered off. You should also be aware of the CPU Affinity and Hyperthreading settings:

tuneCPU2

CPU Identification Mask: The CPU Identification Mask (CPUID Mask) setting controls the CPU features that are exposed to the guest OS. Masking features can increase vMotion compatibility because, when carrying out a vMotion, the host compares the CPU features available to the guest operating system with the features available on the destination host.

CPU identification mask settings are available under the Options tab, in the virtual machine’s settings:

 

tuneCPU3

CPU/MMU Virtualisation: An ESXi host will automatically choose whether a virtual machine should use hardware support for virtualisation based on the host’s processor type and the guest OS installed on the virtual machine. However, the automatic setting can be overridden.

As stated in the Performance Best Practices guide, on processors that support hardware-assisted CPU virtualisation but not hardware-assisted MMU virtualisation, ESXi chooses between two virtual machine monitor (VMM) modes: binary translation (BT) with a software MMU (swMMU), or hardware virtualisation (HV) with swMMU.

When the host has processors that support hardware-assisted MMU virtualisation, ESXi chooses from HV with a hardware MMU (hwMMU), HV with swMMU, and BT with swMMU, depending on the guest operating system installed in the virtual machine.

To manually change the mode, if supported, edit the virtual machine settings, choose the Options tab, and select CPU/MMU Virtualisation:

tuneCPU4

 

The options available are as follows:

  • Automatic – allows ESXi to determine the best choice. This is the default setting.
  • Use software for instruction set and MMU virtualization – disables both hardware-assisted CPU virtualization (VT-x/AMD-V) and hardware-assisted MMU virtualization (EPT/RVI).
  • Use Intel® VT-x/AMD-V™ for instruction set virtualization and software for MMU virtualization enables hardware-assisted CPU virtualization (VT-x/AMD-V) but disables hardware-assisted MMU virtualization (EPT/RVI).
  • Use Intel® VT-x/AMD-V™ for instruction set virtualization and Intel® EPT/AMD RVI for MMU virtualization enables both hardware-assisted CPU virtualization (VT-x/AMD-V) and hardware-assisted MMU virtualization (EPT/RVI).
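
These drop-down choices map to a pair of advanced parameters in the .vmx file. The names below are my understanding of the standard options – verify them on your own build before scripting against them:

    # e.g. "Use Intel VT-x/AMD-V for instruction set virtualization and
    # software for MMU virtualization" would equate to:
    monitor.virtual_exec = "hardware"
    monitor.virtual_mmu  = "software"
    # valid values for both are "automatic", "software" and "hardware"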

If you want to confirm which mode a virtual machine is using, it is recorded in the vmware.log file. A quick way to check is to search for ‘MONITOR’ in the log file:

tuneCPU5

The filtered list displays the valid modes for the VM, with the leftmost mode being the one in use.
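
From the ESXi shell (or an SSH session) the check looks something like this – the datastore and VM folder names are placeholders:

    grep MONITOR /vmfs/volumes/datastore1/MyVM/vmware.log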

Tune Virtual Machine storage configurations

In reality there’s not that much tuning you can do at the VMware level to improve storage performance; most tuning needs to be done on the storage array itself.

Multipathing: select the right policy for your array (check with your vendor);

  • MRU (active passive)
  • Fixed (active/active)
  • Fixed_AP (active/passive and ALUA)
  • RR (active/active, typically with ALUA)

Check multipathing configuration using esxcli and vicfg-mpath. For iSCSI check the software port binding.
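
A few examples of checking (and, if your vendor recommends it, changing) the path selection policy from the ESXi shell – the device identifier and adapter name below are placeholders:

    # show each device with its current Path Selection Policy (PSP) and SATP
    esxcli storage nmp device list
    # list all paths, including their state (active/standby/dead)
    esxcli storage core path list
    # change the PSP for a single device to Round Robin (only if the array supports it)
    esxcli storage nmp device set -d naa.600508xxxxxxxxxx -P VMW_PSP_RR
    # for software iSCSI, confirm the VMkernel port binding
    esxcli iscsi networkportal list -A vmhba33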

Storage alignment: You should always align storage at the array, VMFS, and guest OS levels.

Storage related queues: Use esxcfg-module to amend the LUN (HBA) queue depth (the default is 32) – the module and parameter names vary per vendor. Use esxcfg-advcfg to amend the VMkernel queue depth, which should be set to the same value as the LUN queue depth (see the example below). NOTE: If you adjust the LUN queue depth you have to adjust it on every host in the cluster.
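
A sketch of the queue depth changes, using QLogic as the vendor example – substitute the module and parameter names your HBA vendor documents:

    # set the HBA LUN queue depth to 64 on a QLogic adapter (reboot required)
    esxcfg-module -s ql2xmaxqdepth=64 qla2xxx
    # set the VMkernel outstanding requests per device to match
    esxcfg-advcfg -s 64 /Disk/SchedNumReqOutstanding
    # confirm the current value
    esxcfg-advcfg -g /Disk/SchedNumReqOutstanding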

Generic tips for optimising storage performance:

  • Check IOps
  • Check latency
  • Check bandwidth
  • Remember for iSCSI and NAS you may also have to check network performance

Calculate available resources

There are a number of ways to view resource usage and availability in vCenter, depending on whether you are looking at a cluster as a whole or at individual hosts. To get an overall view of resource utilisation for a cluster, you can use the cluster’s resource distribution chart, which details both CPU and memory utilisation.

To view the chart, go to the cluster’s summary tab and click ‘View Resource Distribution Chart’:

tune01

 

tune02

You can investigate resource utilisation on individual hosts by using the vSphere client to view performance data, or by using esxtop/resxtop. The host’s Summary tab gives an overview of the host’s CPU and memory usage, though you should bear in mind that the memory usage shown here is based on ‘consumed’ memory, not a total of the virtual machines’ ‘active’ memory:

 

tune03

More in depth information can be seen by viewing the charts available on the host’s Performance tab.

As stated earlier, you can also use esxtop to help you calculate a host’s available CPU and memory resources. For example, pressing ‘c’ lets you view CPU performance information:

tune04

PCPU USED (%) represents the effective work of that particular CPU, allowing you to calculate the available resources per PCPU. To check the current memory utilisation in esxtop, press the ‘m’ key:

tune05

PMEM shows the amount of physical memory in the host, how much is being used by the VMkernel and how much memory is free.
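
If you want to capture this data over time rather than watching it live, esxtop’s batch mode is handy. The interval and iteration counts below are arbitrary examples, and note the output file can grow large:

    # capture all esxtop counters every 5 seconds, 60 times (~5 minutes), to CSV
    esxtop -b -d 5 -n 60 > /tmp/esxtop-capture.csv

The resulting CSV can then be opened in Windows perfmon for offline analysis.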

You can also view memory and CPU resource usage details for individual virtual machines in the vSphere client, on the virtual machine’s Resource Allocation tab:

tune06

 

Properly size a Virtual Machine based on application workload

It’s important to allocate the correct amount of resources to a virtual machine to avoid creating performance issues and wasting resources. There are tools, such as VMware Capacity Planner, which can help you analyse a server’s workload in order to allocate it the correct amount of resources as a virtual machine. Remember – you can always increase the VM’s resource allocation later if necessary, so if you are unsure exactly what the resource requirements are, it’s better to start with fewer resources rather than more.

Modify large memory page settings

ESXi 5 allows virtual machines to use large memory pages (2 MB memory pages). There can be a performance benefit for guest OSes and applications that request large pages, as it can reduce the load on the host’s CPU. One thing to bear in mind is that the Transparent Page Sharing (TPS) feature does not share large memory pages, which may mean that your host’s physical memory resources become overcommitted more quickly than they otherwise would have done.

The settings relating to large memory pages can be found in the host’s advanced settings:

tune07

The relevant settings are shown below (taken from the vSphere Resource Management documentation):

  • Mem.AllocGuestLargePage – Enables backing of guest large pages with host large pages. Reduces TLB misses and improves performance in server workloads that use guest large pages. 0 = disable. Default: 1
  • Mem.AllocUsePSharePool and Mem.AllocUseGuestPool – Reduce memory fragmentation by improving the probability of backing guest large pages with host large pages. If host memory is fragmented, the availability of host large pages is reduced. 0 = disable. Default: 15
  • LPage.LPageDefragEnable – Enables large page defragmentation. 0 = disable. Default: 1
  • LPage.LPageDefragRateVM – Maximum number of large page defragmentation attempts per second per virtual machine. Accepted values range from 1 to 1024. Default: 32
  • LPage.LPageDefragRateTotal – Maximum number of large page defragmentation attempts per second. Accepted values range from 1 to 10240. Default: 256
  • LPage.LPageAlwaysTryForNPT – Try to allocate large pages for nested page tables (called ‘RVI’ by AMD or ‘EPT’ by Intel). If you enable this option, all guest memory is backed with large pages in machines that use nested page tables (for example, AMD Barcelona). If NPT is not available, only some portion of guest memory is backed with large pages. 0 = disable. Default: 1
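
The same values can also be viewed or changed from the ESXi shell with esxcfg-advcfg if you prefer; for example (illustrative only – think carefully before changing these defaults):

    # check whether guest large pages are backed by host large pages
    esxcfg-advcfg -g /Mem/AllocGuestLargePage
    # disable it (0) or re-enable it (1)
    esxcfg-advcfg -s 0 /Mem/AllocGuestLargePage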

Understand appropriate use cases for CPU affinity

CPU affinity allows you to choose which physical processor(s) you want a virtual machine to run on. The option is only available for virtual machines that are not in a DRS cluster, or for those that have DRS set to manual.

The setting can be found in a virtual machine’s settings, on the Resources tab, under Advanced CPU.

 

There are a number of issues to be aware of when using CPU affinity:

  • For multiprocessor systems, ESX/ESXi systems perform automatic load balancing. Avoid manual specification of virtual machine affinity to improve the scheduler’s ability to balance load across processors.
  • Affinity can interfere with the ESX/ESXi host’s ability to meet the reservation and shares specified for a virtual machine.
  • Because CPU admission control does not consider affinity, a virtual machine with manual affinity settings might not always receive its full reservation. Virtual machines that do not have manual affinity settings are not adversely affected by virtual machines with manual affinity settings.
  • When you move a virtual machine from one host to another, affinity might no longer apply because the new host might have a different number of processors.
  • The NUMA scheduler might not be able to manage a virtual machine that is already assigned to certain processors using affinity.
  • Affinity can affect an ESX/ESXi host’s ability to schedule virtual machines on multicore or hyperthreaded processors to take full advantage of resources shared on such processors.
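
For completeness, the affinity set in the GUI corresponds to a scheduler setting along the lines of the following – the parameter name reflects my understanding of the standard .vmx option, so double-check it before relying on it:

    # .vmx entry pinning the VM's vCPU scheduling to physical CPUs 0 and 1
    sched.cpu.affinity = "0,1"
    # "all" removes the restriction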

Configure alternate virtual machine swap locations

When a virtual machine is powered on, the host creates a swap file for it. The size of this swap file is equal to the difference between the virtual machine’s configured memory and its memory reservation (if one is set). For example, a virtual machine with 4 GB of configured memory and a 1 GB reservation gets a 3 GB swap file.

By default, virtual machine swap files are created in the virtual machine’s working directory (the same location as the .vmx file). However, it is possible to change the default location. The first thing to do is to configure the cluster swap file setting by editing the cluster:

tune08

 

After updating the cluster settings you need to configure the host setting. This is found under the Configuration tab for the host. Click on ‘Virtual Machine Swapfile Location’:

tune09

 

Again, by default the swap files are created in the virtual machine’s working directory. To configure a specific swapfile location click ‘Edit’:

tune10

 

After choosing the datastore on which to store the swap files, click OK. This setting is specific to the host, so you will need to make the change on every host in the cluster.

Swap file location can be overridden on a per virtual machine basis by setting the option in the virtual machine’s settings:

tune11
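
Finally, for the per-VM override there is an equivalent advanced parameter. A hedged example – the datastore path is a placeholder and the directory must already exist:

    # .vmx entry placing this VM's swap file on a specific datastore
    sched.swap.dir = "/vmfs/volumes/FastLocalDS/MyVM"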