Objective 8.5 – Perform Basic Troubleshooting for HA/DRS and VMotion
Written by Matthijs van den Berg   
Wednesday, 20 January 2010 21:51

Knowledge

  • Explain the requirements of HA/DRS and VMotion
    VMotion, the technique used for DRS and to some extent HA, has certain requirements to work. I have tried to put most of them in a list:
    • Compatible CPUs
      Depending on your VMotion type (Enhanced VMotion Compatibility or regular VMotion) you need matching CPUs. For regular VMotion these must come from the same family; with Enhanced VMotion Compatibility (EVC) this requirement is relaxed to the same vendor.
    • Advanced CPU Features
      All hosts must have AMD-V or Intel-VT and AMD-NX or Intel XD.
    • A Gigabit network interface
      At least a gigabit NIC is required for vMotion to transfer the state of the VM to another host.
    • Jumbo Frames
      Jumbo Frames are recommended for the best vMotion performance.
    • All hosts must be connected to a vCenter server
      The hosts must be part of a vCenter environment and have the correct licences applied.
    • Shared Storage
      All hosts must be able to access shared storage where VMs can reside.
    • VM without RAW disk or physically connected devices
      The VM must not have a RAW device for clustering purposes or any physical devices connected, like a local CD-ROM drive from a host or management station.
  • Verify VMotion functionality
    vMotion uses a dedicated interface to transfer data. Usually this interface is designed to use a separate VLAN / subnet. To test whether network connectivity, optionally with jumbo frames, is working properly, you can use the vmkping command:
    vmkping [options] [host|IP address]
    Read more on vmkping and the available options here. For the ultimate test you can manually vMotion a VM from one host to another.
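    As a rough illustration of the jumbo frames test, the sketch below builds a vmkping command line that sends packets with the don't-fragment bit set and a payload sized to fill one frame exactly. It assumes a 9000-byte MTU and that the vmkping on your ESX build supports the -d, -s and -c options (check the usage output of vmkping to be sure). The target IP address and the function names are made up for the example.

```python
# Sketch: build a vmkping command line for a jumbo-frame test.
# Assumptions: 9000-byte MTU, vmkping supports -d / -s / -c,
# and the target IP below is a made-up example address.

IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def icmp_payload_size(mtu):
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def build_vmkping_command(target, mtu=9000, count=3):
    """Return the vmkping argument list for a do-not-fragment test."""
    return [
        "vmkping",
        "-d",                               # set the don't-fragment bit
        "-s", str(icmp_payload_size(mtu)),  # payload size in bytes
        "-c", str(count),                   # number of echo requests
        target,
    ]


if __name__ == "__main__":
    # prints: vmkping -d -s 8972 -c 3 192.168.10.12
    print(" ".join(build_vmkping_command("192.168.10.12")))
```

    If the command with the large payload fails while a plain vmkping to the same address works, a switch port or vSwitch along the path is probably not configured for jumbo frames.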
  • Verify DNS settings
    Adding a host to your company's DNS is essential. Without DNS, things like HA will act strangely. Though a hosts file can do the trick, DNS is usually easier to configure and maintain. Make sure that the following DNS lookups work:
    • Resolve your hostname
    • Resolve your FQDN hostname
    • Resolve your IP address (reverse lookup)
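    The three lookups above can be sketched in a few lines of Python. This is only an illustration using the standard socket module, not a VMware tool; the host name, domain and IP address are hypothetical, and the resolver functions can be swapped out so the logic can be tried without a live DNS server.

```python
import socket


def check_dns(short_name, fqdn, ip,
              forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Run the three lookups and return a dict of check name -> bool.

    forward and reverse default to the real resolver but are injectable,
    so the checks can be exercised without a DNS server.
    """
    def resolves_to_ip(name):
        try:
            return forward(name) == ip
        except OSError:          # covers socket.gaierror / socket.herror
            return False

    results = {
        "short name": resolves_to_ip(short_name),
        "fqdn": resolves_to_ip(fqdn),
    }
    try:
        # gethostbyaddr returns (primary_name, alias_list, ip_list)
        primary, aliases, _ = reverse(ip)
        names = [primary] + list(aliases)
        results["reverse"] = fqdn.lower() in [n.lower() for n in names]
    except OSError:
        results["reverse"] = False
    return results


if __name__ == "__main__":
    # Hypothetical records standing in for a real DNS zone.
    records = {"esx01": "192.168.1.11",
               "esx01.example.local": "192.168.1.11"}

    def fake_forward(name):
        if name not in records:
            raise OSError("name not found")
        return records[name]

    def fake_reverse(address):
        return ("esx01.example.local", [], [address])

    print(check_dns("esx01", "esx01.example.local", "192.168.1.11",
                    fake_forward, fake_reverse))
```

    Calling check_dns without the last two arguments uses the resolver of the machine you run it on, which should be a machine using the same DNS servers as your hosts.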
  • Verify the service console network functionality
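    From the service console, use the regular ping command to test the SC network connection. vmkping (see above under Verify VMotion functionality) sends its packets through the VMkernel interface instead, so it tests the VMkernel network rather than the service console.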
  • Interpret the DRS Resource Distribution Graph and Target/Current Host Load Deviation
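    The Hosts tab of your cluster contains the following view:
    [Screenshot: DRS resource distribution]
    This view shows the amount of resources (CPU / memory) being used on each host. When CPU or memory usage is unbalanced for a longer period of time, moving one or more VMs might balance the load and let all servers on those hosts perform better. DRS also reports a current and a target host load standard deviation; when the current value is higher than the target, the cluster is considered imbalanced.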
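    To get a feel for what a host load deviation expresses, the sketch below computes the standard deviation of a set of per-host load fractions and compares it with a target value. This is a simplification: the load metric DRS really uses is calculated by vCenter, and the target depends on the migration threshold of the cluster. The load numbers and the target of 0.2 are invented for the example.

```python
from statistics import pstdev


def host_load_deviation(host_loads):
    """Population standard deviation of per-host load fractions (0.0 to 1.0)."""
    return pstdev(host_loads)


def is_balanced(host_loads, target_deviation):
    """True when the current deviation does not exceed the target."""
    return host_load_deviation(host_loads) <= target_deviation


if __name__ == "__main__":
    target = 0.2                  # made-up target deviation
    even = [0.45, 0.50, 0.55]     # hosts carrying similar load
    skewed = [0.10, 0.20, 0.90]   # one host doing most of the work

    for name, loads in (("even", even), ("skewed", skewed)):
        print(name, round(host_load_deviation(loads), 3),
              "balanced" if is_balanced(loads, target) else "imbalanced")
```

    The more the hosts differ in load, the higher the deviation; migrating VMs away from the busiest host brings the current value back toward the target.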
  • Troubleshoot VMotion using topology maps
    Topology maps are an easy way to show the network and storage connections from and to ESX hosts and / or VMs. As stated above, vMotion has some requirements, like shared storage, networking etc. A first and easy check is to look at the topology maps and see if these requirements are met. Maps can be found by selecting a server and then selecting the Maps tab.
  • Troubleshoot HA capacity issues
    When planning for HA you need to plan for a maximum number of host failures: the number of hosts that can fail before you run short on resources. When VMs can no longer start, this might be due to a lack of resources (memory is quite common). Your VI Client will provide you with warnings like “insufficient resources to satisfy failover level” etc. Read more here (VI doc but most info is still relevant). http://www.vmware.com/files/pdf/VMwareHA_twp.pdf
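    The idea of "how many hosts can fail before you run short" can be illustrated with a small back-of-the-envelope calculation. The sketch below is not the admission control algorithm HA itself uses; it only looks at memory, assumes the largest hosts fail first (the worst case), and works with made-up capacities.

```python
def host_failures_tolerated(host_capacities_gb, required_gb):
    """Count how many hosts can fail, largest first, while the
    remaining hosts still provide at least required_gb of memory."""
    remaining = sorted(host_capacities_gb)  # smallest first
    failures = 0
    while len(remaining) > 1:
        remaining.pop()                     # worst case: the largest host fails
        if sum(remaining) < required_gb:
            break                           # the survivors can no longer carry the load
        failures += 1
    return failures


if __name__ == "__main__":
    hosts = [64, 64, 64, 64]   # GB of memory per host (made-up values)
    needed = 120               # GB needed by all VMs together (made-up value)
    # prints: 2
    print(host_failures_tolerated(hosts, needed))
```

    With four 64 GB hosts and 120 GB of VM memory, two hosts can fail (128 GB left); after a third failure only 64 GB remains and VMs will no longer start.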
  • Troubleshoot HA redundancy issues
    HA redundancy..  What is meant here? The number of host failures allowed? A second service console to counter network issues? If you know, help me out using the comment system please!

Tools
