vNUMA Improvement in vSphere 6

NUMA is always an interesting topic in design and operations in the virtualization space.  We need to understand it so we can size a VM more effectively and efficiently, allowing applications to perform at their optimum.

To understand what NUMA is and how it works, a very good article to read is here.  Mathias has explained it in very simple terms, with good pictures that I do not have to reinvent.  How I wish I had this article back then.

Starting from ESX 3.5, ESX became NUMA-aware, allowing for memory locality via the NUMA node concept.  This helps address memory locality performance.

In vSphere 4.1, the wide-VM concept was introduced for VMs allocated more vCPUs than the physical cores per CPU (i.e. larger than a NUMA node).  Check out Frank's post.
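
As a rough illustration of the wide-VM idea (the numbers and helper below are my own assumptions, not taken from Frank's post), such a VM is split into multiple NUMA clients so that each client fits within a physical NUMA node:

```python
# Illustrative sketch only: how a wide VM is split into NUMA clients so each
# client fits within a physical NUMA node. The numbers and helper name are
# hypothetical, not an ESXi interface.
import math

def numa_clients(vcpus, cores_per_numa_node):
    """Return how many NUMA clients a VM would be split into."""
    return math.ceil(vcpus / cores_per_numa_node)

# A 12-vCPU VM on a host with 8 cores per NUMA node is "wide":
print(numa_clients(12, 8))  # -> 2 clients, with the vCPUs spread across them
```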

In vSphere 5.0, vNUMA was introduced to improve CPU scheduling performance by exposing the physical NUMA architecture to the VM.  Understanding how this works helps explain why, as a best practice, we try not to place ESXi servers of different makes in the same cluster.  You can read more about it here.

All these NUMA improvements help address memory locality issues.  But how does memory allocation work when using Memory Hot-Add, given that Memory Hot-Add was not vNUMA aware?

With the release of vSphere 6, there are also NUMA improvements in terms of memory.  One of these is that Memory Hot-Add is now vNUMA aware.  However, many were not aware of how memory was previously allocated.

Here I will illustrate with some diagrams to help in understanding.

Let's start with what happened prior to vSphere 6 when memory is hot-added to a VM.

Take a VM with 3 GB of virtual memory configured.

When an additional 3 GB of memory is hot-added to the VM, the memory is allocated entirely to the first NUMA node, spilling over to the next node only once that node runs out of memory, one after another in sequence.
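
To make this concrete, here is a minimal Python sketch of that sequential fill behaviour.  The two-node layout, capacities, and function name are purely illustrative assumptions, not any real ESXi interface:

```python
# Illustrative sketch only: simulates the pre-vSphere 6 hot-add behaviour
# described above. The node sizes, dictionary layout, and function name are
# hypothetical, not an actual ESXi interface.

def hot_add_sequential(nodes, add_gb):
    """Place hot-added memory on the first NUMA node, spilling over only when full."""
    remaining = add_gb
    for node in nodes:
        free = node["capacity_gb"] - node["used_gb"]
        take = min(free, remaining)
        node["used_gb"] += take
        remaining -= take
        if remaining == 0:
            break
    return nodes

# Two-node host; the VM's original 3 GB already sits on node 0.
numa_nodes = [
    {"id": 0, "capacity_gb": 8, "used_gb": 3},
    {"id": 1, "capacity_gb": 8, "used_gb": 0},
]
print(hot_add_sequential(numa_nodes, 3))
# Node 0 ends up with 6 GB and node 1 with 0 GB, so any vCPU scheduled on
# node 1 would be accessing remote memory.
```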

In vSphere 6.0, hot-added memory is now more NUMA friendly.

Memory allocation is now balanced evenly across all the NUMA nodes instead of being placed all in one basket on the first NUMA node.  Spreading the memory this way increases the chance of a local memory access.
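
A matching sketch, under the same illustrative assumptions as before, of the balanced placement in vSphere 6.0:

```python
# Illustrative sketch only: simulates the vSphere 6.0 behaviour described
# above, where hot-added memory is spread evenly across the NUMA nodes.
# The node layout and function name are hypothetical.

def hot_add_balanced(nodes, add_gb):
    """Split the hot-added memory evenly across all NUMA nodes."""
    share, leftover = divmod(add_gb, len(nodes))
    for i, node in enumerate(nodes):
        node["used_gb"] += share + (1 if i < leftover else 0)
    return nodes

numa_nodes = [
    {"id": 0, "capacity_gb": 8, "used_gb": 3},
    {"id": 1, "capacity_gb": 8, "used_gb": 0},
]
print(hot_add_balanced(numa_nodes, 3))
# The 3 GB is split across both nodes (2 GB / 1 GB in whole-GB units here),
# so a vCPU on either node has a better chance of a local memory access.
```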

We might wish this could be smarter, but of course we cannot predict which NUMA node a running process will access its memory from.

Hope this helps give you a better picture when doing sizing and enabling the hot-add function.
