This post is going to be one of my favorite posts this year, because many of my readers have asked me to write about the ESXi host memory management techniques. Nearly all VMware administrators are aware that ESXi uses memory management techniques to handle memory overcommitment. Overcommitment simply means that you can allocate more memory to your virtual machines than is physically available on the ESXi host. Let's assume your 10 virtual machines are each configured with 4 GB of memory, for a total of 40 GB allocated, while your ESXi host has only 30 GB of physical memory available. This overcommitment can be achieved with the help of the memory management techniques, and also because not all VMs use 100% of their allocated memory at all times. When they do, the ESXi host actively uses its memory reclamation techniques to handle the situation efficiently.
Below are the memory management techniques available as part of the ESXi host:
1. Transparent Page Sharing
2. Memory ballooning
3. Memory Compression
4. Hypervisor-level memory swapping
Most VMware administrators are aware of these techniques, but do you know when, and at what stage, each technique is used? Let's discuss the various ESXi host memory states in detail. The ESXi host never attempts to reclaim memory using these techniques until it is under memory contention. Memory reclamation techniques such as memory ballooning, compression and swapping come into action based on the amount of free ESXi host memory. There are 4 different ESXi host memory states:
1. High
2. Soft
3. Hard
4. Low
High -> Transparent Page Sharing runs at all times by default
Soft -> Memory ballooning is activated when the ESXi host enters the soft state and remains active until the host returns to the high state
Hard & Low -> Memory compression and hypervisor-level memory swapping are used by ESXi when the host is in the hard or low state
Low -> If the host's free memory drops below the low threshold, the ESXi host stops creating new pages for virtual machines and continues compressing and swapping until enough memory has been freed up
Prior to vSphere 5, High was set by default at 6% of host memory, Soft at 4%, Hard at 2%, and Low at 1%. If the ESXi host's free memory falls below one of these percentages, ESXi uses the corresponding memory reclamation technique to reclaim memory. But think about a host configured with a large amount of memory: it is not necessary to protect that much free memory. Let's take an example: an ESXi 5.0 host can run with 2 TB of memory, and using the pre-vSphere 5.0 values, 6% of 2 TB is roughly 120 GB, so the host would start to reclaim memory even while well over 100 GB was still free. This is really not a great option. So with vSphere 5.x, these predefined values have been changed to handle memory reclamation more effectively on ESXi hosts configured with large amounts of memory.
With vSphere 5, the High state threshold is adjusted according to the amount of memory in the host. Below is the calculation:
High -> 900 MB for the first 28 GB, plus 1% of all memory above 28 GB (for hosts with more than 28 GB of memory)
Soft -> 64% of High (roughly 2/3)
Hard -> 32% of High (roughly 1/3)
Low -> 16% of High (roughly 1/6)
Below are a few examples of the memory reclamation levels of an ESXi host.
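To make the calculation concrete, here is a rough sketch (this is my own illustration, not a VMware tool; the helper name and the 128 GB host size are arbitrary) that estimates the thresholds for a host with more than 28 GB of RAM:

```python
# Rough sketch: estimate the vSphere 5.x free-memory thresholds described above
# for a host with more than 28 GB of RAM. Not an official VMware calculation tool.

def reclamation_thresholds_mb(host_memory_gb):
    """Return the High/Soft/Hard/Low free-memory thresholds in MB."""
    host_mb = host_memory_gb * 1024
    high = 900 + 0.01 * (host_mb - 28 * 1024)   # 900 MB + 1% of memory above 28 GB
    return {
        "High": high,
        "Soft": high * 0.64,   # roughly 2/3 of High
        "Hard": high * 0.32,   # roughly 1/3 of High
        "Low":  high * 0.16,   # roughly 1/6 of High
    }

# Example: a 128 GB host
for state, mb in reclamation_thresholds_mb(128).items():
    print(f"{state:>4}: {mb:,.0f} MB free")
```

For a 128 GB host this works out to a High threshold of roughly 1.9 GB of free memory, which is far more sensible than the roughly 7.7 GB that the old 6% rule would have protected.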
Verify the Current ESXi Host Memory State
We have now covered the ESXi memory states and how the memory reclamation techniques are used based on them. Now, let's learn how to verify the current memory state of the ESXi host.
Log in to your ESXi host using SSH, run the below command and then press m to switch to the memory view:
esxtop
You can see the current memory state of the ESXi host, as shown in the screenshots below.
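If you would rather check this remotely from a script than from an interactive SSH session, the rough pyVmomi sketch below reports how much memory each host has free, which you can compare against the thresholds discussed above. The vCenter address and credentials are placeholders, and comparing free memory to the thresholds yourself is just my illustration, not an official way of reading the memory state:

```python
# Rough sketch using pyVmomi: report free memory per host so it can be compared
# against the reclamation thresholds discussed above. The hostname and credentials
# below are placeholders for your own environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab use only; validate certificates in production
si = SmartConnect(host="vcenter.lab.local", user="readonly@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(content.rootFolder,
                                                   [vim.HostSystem], True)
    for host in view.view:
        total_mb = host.hardware.memorySize / (1024 * 1024)    # reported in bytes
        used_mb = host.summary.quickStats.overallMemoryUsage   # reported in MB
        print(f"{host.name}: {total_mb - used_mb:,.0f} MB free of {total_mb:,.0f} MB")
    view.Destroy()
finally:
    Disconnect(si)
```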
That's it. We will take a look at the various memory reclamation techniques in upcoming posts. I hope this is informative for you. Thanks for reading. Be social and share it on social media if you feel it is worth sharing.
Problem Statement
What is the most suitable hardware specification for this environment's ESXi hosts?
Requirements
1. Support Virtual Machines of up to 16 vCPUs and 256GB RAM
2. Achieve up to 400% CPU overcommitment
3. Achieve up to 150% RAM overcommitment
4. Ensure cluster performance is both consistent & maximized
5. Support IP based storage (NFS & iSCSI)
6. The average VM size is 1vCPU / 4GB RAM
7. Cluster must support approximately 1000 average-sized virtual machines on day 1
8. The solution should be scalable beyond 1000 VMs (Future-Proofing)
9. N+2 redundancy
Assumptions
1. vSphere 5.0 or later
2. vSphere Enterprise Plus licensing (to support Network I/O Control)
3. VMs range from Business Critical Applications (BCAs) to non-critical servers
4. Software licensing for applications being hosted in the environment is based on per vCPU OR per host, where DRS 'Must' rules can be used to isolate VMs to licensed ESXi hosts
Constraints
1. None
Motivation
1. Create a Scalable solution
2. Ensure high performance
3. Minimize HA overhead
4. Maximize flexibility
Architectural Decision
Use Two Socket Servers w/ >= 8 cores per socket with HT support (16 physical cores / 32 logical cores), 256GB RAM, 2 x 10GB NICs
Justification
1. Two-socket, 8-core (or greater) CPUs with Hyper-Threading will provide flexibility for CPU scheduling of large numbers of diversely sized (by vCPU count) VMs, minimizing CPU Ready (contention)
2. Using two-socket servers of the proposed specification will support the required 1000 average-sized VMs across 18 hosts, with 11% of cluster resources reserved for HA to meet the required N+2 redundancy (the arithmetic is sketched after this list)
3. A cluster size of 18 hosts will deliver excellent cluster (DRS) efficiency / flexibility with minimal overhead for HA (Only 11%) thus ensuring cluster performance is both consistent & maximized.
4. The cluster can be expanded with up to 14 more hosts (to the 32 host cluster limit) in the event the average VM size is greater than anticipated or the customer experiences growth
5. Having 2 x 10GB connections should comfortably support the IP Storage / vMotion / FT and network data with minimal possibility of contention. In the event of contention Network I/O Control will be configured to minimize any impact (see Example VMware vNetworking Design w/ 2 x 10GB NICs)
6. RAM is one of the most common bottlenecks in a virtual environment; with 16 physical cores and 256GB RAM this equates to 16GB of RAM per physical core. For the average-sized VM (1vCPU / 4GB RAM) this meets the CPU overcommitment target (up to 400%) with no RAM overcommitment, minimizing the chance of RAM becoming the bottleneck
7. In the event of a host failure, the number of Virtual machines impacted will be up to 64 (based on the assumed average size VM) which is minimal when compared to a Four Socket ESXi host which would see 128 VMs impacted by a single host outage
8. If using four-socket ESXi hosts, the cluster size would be approx 10 hosts, and 20% of cluster resources would have to be reserved for HA to meet the N+2 redundancy requirement. This cluster size is less efficient from a DRS perspective, and the HA overhead would equate to higher CapEx and, as a result, a lower ROI
9. The solution supports Virtual machines of up to 16 vCPUs and 256GB RAM although this size VM would be discouraged in favour of a scale out approach (where possible)
10. The cluster aligns with a virtualization friendly 'Scale out' methodology
11. Using smaller hosts (either single socket, or fewer cores per socket) would not meet the requirement to support virtual machines of up to 16 vCPUs and 256GB RAM, would likely require multiple clusters, and would require additional 10GB and 1GB cabling compared to the two-socket configuration
12. The two-socket configuration allows the cluster to be scaled (expanded) at a very granular level (if required) to reduce CapEx and minimize the waste/unused cluster capacity that would come with adding larger hosts
13. Enabling features such as Distributed Power Management (DPM) are more attractive and lower risk for larger clusters and may result in lower environmental costs (ie: Power / Cooling)
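As a quick sanity check of the sizing figures in justification points 2, 6 and 8 (the inputs are simply the stated requirements, the proposed two-socket specification and the 512GB four-socket alternative), the arithmetic works out as follows:

```python
# Back-of-the-envelope sizing check for justification points 2, 6 and 8.
# Inputs are taken from the requirements and the proposed host specifications.

vm_count, vm_vcpu, vm_ram_gb = 1000, 1, 4            # average VM: 1 vCPU / 4 GB RAM
host_cores, host_ram_gb, n_redundancy = 16, 256, 2   # proposed two-socket host, N+2

vms_per_host = host_ram_gb // vm_ram_gb              # 64 VMs per host with no RAM overcommitment
hosts_for_load = -(-vm_count // vms_per_host)        # ceil(1000 / 64) = 16 hosts
cluster_size = hosts_for_load + n_redundancy         # 18 hosts including N+2
ha_overhead = n_redundancy / cluster_size            # 2 / 18 ~= 11% reserved for HA
cpu_overcommit = (vms_per_host * vm_vcpu) / host_cores   # 64 vCPUs on 16 cores = 400%

print(f"Two-socket: {cluster_size} hosts, HA overhead {ha_overhead:.0%}, "
      f"CPU overcommitment {cpu_overcommit:.0%}")

# The same arithmetic for the four-socket alternative (512GB RAM per host):
vms_per_big_host = 512 // vm_ram_gb                  # 128 VMs per host
big_cluster = -(-vm_count // vms_per_big_host) + n_redundancy   # 8 + 2 = 10 hosts
print(f"Four-socket: {big_cluster} hosts, HA overhead {n_redundancy / big_cluster:.0%}")
```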
Alternatives
1. Use Four Socket Servers w/ >= 8 cores per socket, 512GB RAM, 4 x 10GB NICs
2. Use Single Socket Servers w/ >= 8 cores, 128GB RAM, 2 x 10GB NICs
3. Use Two Socket Servers w/ >= 8 cores, 512GB RAM, 2 x 10GB NICs
4. Use Two Socket Servers w/ >= 8 cores, 384GB RAM, 2 x 10GB NICs
5. Have two clusters of 9 hosts with the recommended hardware specifications
Implications
1. Additional IP addresses for ESXi Management, vMotion, FT & Out of band management will be required as compared to a solution using larger hosts
2. Additional out of band management cabling will be required as compared to a solution using larger hosts
Related Articles
1. Example Architectural Decision – Network I/O Control for ESXi Host using IP Storage (4 x 10 GB NICs)
2. Example VMware vNetworking Design w/ 2 x 10GB NICs
3. Network I/O Control Shares/Limits for ESXi Host using IP Storage
4. VMware Clusters – Scale up for Scale out?
5. Jumbo Frames for IP Storage (Do not use Jumbo Frames)
6. Jumbo Frames for IP Storage (Use Jumbo Frames)