Support for non-productive SAP HANA systems on VMware vSphere 5.1 was announced in November 2012. Since April 2014, customers can also virtualize productive SAP HANA systems on VMware vSphere 5.5. Currently, some restrictions apply that prevent SAP HANA from being treated like any other SAP application running on VMware vSphere. But because the conditions for virtualized HANA will be harmonized over time in order to build a homogeneous SAP technology platform, it is recommended to continuously follow the SAP documentation and always refer to the latest version only. And because HANA virtualization support advances very fast, we have to work with a lot of references in this document.
The audience of this wiki should have a basic understanding of the following components:
The following combination has been tested and verified as working:
- Cisco UCS Manager 2.1
- VMware ESXi 5.5
- VMware vCenter Server 5.5
- SUSE Linux Enterprise Server 11 SP3
- Red Hat Enterprise Linux Server 6.5
- SAP HANA 1.0 SPS07
Using lower versions is strongly discouraged, while newer versions are expected to work as long as the combination is reflected in the Cisco and VMware compatibility guides and the Certified and Supported SAP HANA Hardware Directory.
>> The current support situation can be found at the SAP HANA on VMware vSphere wiki page. <<
Although one of the goals of this wiki is to consolidate information about the virtualized HANA deployment process, there are still a lot of necessary references to other documentation. It is important to know these documents very well and to consult them during planning and implementation.
Whenever there is a reference in this wiki which does not contain a link, the corresponding document can be found in this reference section.
SAP HANA TDI References
|Document ID / URL||Description|
|Overview||SAP HANA TDI Overview|
|FAQ||SAP HANA TDI FAQ|
|Storage Requirements||SAP HANA TDI Storage Requirements|
|Network Requirements||SAP HANA TDI Network Requirements|
|PAM||Certified SAP HANA Hardware Directory|
|Document ID / URL||Description|
|-||General Support Statement for Virtual Environments|
|SAP Note 2161991||VMware vSphere configuration guidelines|
|SAP Note 1606643||VMware vSphere host monitoring interface|
|-||SAP HANA Support for virtualized / partitioned (multi-tenant) environments|
|-||SAP HANA on VMware vSphere in production|
|SAP Note 2024433||Multiple SAP HANA VMs on VMware vSphere in production|
|-||SAP Business Warehouse, powered by SAP HANA on VMware vSphere in scale-out and production|
|SAP Note 2315348||SAP HANA on VMware vSphere 6 in production|
|SAP Note 2393917||Single SAP HANA VM on VMware vSphere 6.5 in production|
|SAP HANA Guidelines from SAP||SAP HANA Guidelines for running virtualized|
|SAP Virtualization Best Practices||Overall SAP Virtualization Best Practices from VMware|
|HANA Scale-Up Best Practices||Best Practices and Recommendations for Scale-Up Deployments of SAP HANA on VMware vSphere|
|HANA Scale-Out Best Practices||Best Practices and Recommendations for Scale-Out Deployments of SAP HANA on VMware vSphere|
|Document ID / URL||Description|
|-||SAP Software on Linux: General information|
|SAP Note 2235581||SAP HANA: Supported Operating Systems|
|-||SAP HANA Guidelines for SLES Operating System Installation|
|SAP Note 2205917||SAP HANA DB: Recommended OS settings for SLES 12|
|-||SAP HANA Guidelines for RHEL Operating System Installation|
|SAP Note 2292690||SAP HANA DB: Recommended OS settings for RHEL 7.2|
|Document ID / URL||Description|
|SAP HANA TDI on Cisco UCS||Cisco UCS Configuration Principles for SAP HANA with Shared Storage and Shared Network|
|SAP HANA on Xeon E5||Cisco UCS Server with Intel Xeon Processor E5 - Designing SAP Entry-Level Systems|
|Data Center Solutions for SAP||Case Studies, Solution Briefs and Videos about SAP on Cisco UCS|
|Design Zone for SAP Applications||Cisco Validated Designs (CVD) for SAP HANA and SAP Applications|
|Solution White Papers||Cisco UCS White Papers for Application Solutions|
|Product White Papers||Cisco UCS White Papers and Technical Documents|
UCS Service Profile
The Cisco UCS service profile contains the hardware configuration in a Cisco UCS environment, whereas the ESXi server OS provides the hypervisor on which the virtual machines run. UCS Service Profile Templates or vSphere Auto Deploy can be used to ease the ESXi deployment process. In this example, the creation of a standalone service profile is shown.
In order to get the best virtualization performance, certain BIOS features should be enabled. The C-states can be controlled by the hypervisor and do not necessarily have to be disabled; how balanced this configuration should be depends on performance needs versus power-saving goals.
In the configuration section "Power Management" of the ESXi host, you can choose between different power-saving options. This is always a trade-off between performance and low power consumption.
VMware vSphere screenshot
It is recommended to enable all Intel Direct IO features. This basically gives virtual machines direct access to underlying hardware for improved performance, especially on network interfaces.
It is beneficial for HANA when a VM with 2 vSockets gets scheduled on the same physical socket rather than being spread across remote sockets. While distributing the workload of the same VM across multiple sockets might be beneficial for other workloads, HANA benefits greatly from CPU caching effects (see SAP Note 2024433), and therefore the VM should stick to its sockets as much as possible. Using sched.vcpu.affinity per VM is a quite harsh way to configure NUMA locality and also limits VM operational tasks, so let's look at a gentler approach with almost the same benefits.
VMware KB 2003582 describes the two parameters numa.vcpu.preferHT and numa.preferHT, one for a single VM, the other for the entire ESXi host. Using one of these parameters solves the problem of vCPUs getting scheduled to distant NUMA nodes and therefore leverages CPU caching effects for the VM.
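Following VMware KB 2003582, the per-VM parameter goes into the VM's advanced configuration (.vmx file), while the host-wide variant is an advanced system setting on the ESXi host. A minimal sketch:

```
# Per VM (.vmx / advanced configuration parameter):
numa.vcpu.preferHT = "TRUE"

# Per ESXi host (advanced system setting, applies to all VMs on the host):
# Numa.PreferHT = 1
```

Prefer the per-VM parameter when only the HANA VMs should be affected; the host-wide setting changes scheduling behavior for every VM on that host.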
NUMA Node Sharing
Since multi-VM support for vSphere 6.0 was announced in SAP Note 2315348, it is also allowed to run two HANA VMs on the same physical socket. For OLAP workloads, the recommendation is to calculate with 15 % CPU overhead. While this might spoil the sizing according to the fixed CPU : memory ratio, consider that on an ESXi host some amount of memory is unusable for the VMs anyway. On a server equipped with 512 GB RAM per socket, the two VMs sharing a socket can have 218 GB RAM each, while they should still be configured with 24 vCPUs each, not 20. The only effect you would achieve by configuring two VMs with 20 vCPUs is restricting the compute power of the VMs, as the CPU scheduler is able to efficiently handle overlapping CPU resources of VMs and the hypervisor.
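The per-VM memory math from this example can be sketched as follows. The helper name is mine, and the 15 % overhead figure is the OLAP recommendation quoted above:

```python
# Rough sketch of the per-VM memory math when two HANA VMs share one
# physical socket, assuming the ~15 % overhead figure from the text.
# The helper name is illustrative, not an official sizing formula.

def per_vm_memory_gb(ram_per_socket_gb: float, overhead: float = 0.15,
                     vms_per_socket: int = 2) -> int:
    """Usable vRAM per VM when sharing a socket, after overhead."""
    usable = ram_per_socket_gb * (1.0 - overhead)
    return round(usable / vms_per_socket)

print(per_vm_memory_gb(512))  # 512 * 0.85 / 2 = 217.6 -> 218
```

This reproduces the 218 GB per VM from the 512 GB per socket example.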
Cisco, VMware and SAP Network Terminology
SAP HANA has different types of network communication channels to support the different SAP HANA scenarios and setups.
Source: SAP SE
The following table is a mix of networks derived from the SAP HANA TDI Network Requirements document and the potential networks in a virtualization environment:
Client Zone Networks
|Network||Purpose||Requirement||Bandwidth|
|Application server network||Communication between SAP application server and database||-||1 or 10 Gigabit Ethernet|
|Client network||Communication between user or client application and database||-||1 or 10 Gigabit Ethernet|
|Data source network||Data import and external data integration||Optional for all SAP HANA systems||1 or 10 Gigabit Ethernet|
Internal Zone Networks
|Network||Purpose||Requirement||Bandwidth|
|Internode network||Node-to-node communication within a scale-out configuration||-||10 Gigabit Ethernet|
|System replication network||SAP HANA system replication||HANA System Replication (HSR) only||To be determined with customer|
Storage Zone Networks
|Network||Purpose||Requirement||Bandwidth|
|Backup network||Data backup||Optional for all SAP HANA systems||10 Gigabit Ethernet or 8-Gbps Fibre Channel|
|Storage network||Communication between nodes and storage||External TDI storage system||10 Gigabit Ethernet or 8-Gbps Fibre Channel|
|Administration network||Infrastructure and SAP HANA administration||Optional for all SAP HANA systems||1 Gigabit Ethernet|
|Boot network||OS boot using Preboot Execution Environment (PXE) and Network File System (NFS) or Fibre Channel over Ethernet (FCoE)||Optional for all SAP HANA systems||1 Gigabit Ethernet|
|vMotion||Communication between ESXi hosts for live-migration of VMs (manually via vMotion or dynamically via DRS)||Optional for vMotion-enabled scenarios||(multiple*) 10 Gigabit Ethernet|
*multi-NIC vMotion might be considered
When it comes to naming convention of physical and virtual network devices, especially the term vNIC, Cisco and VMware unfortunately have some overlap:
|UCS||Unified Computing System||Cisco||Cisco's x86 server and datacenter product line.|
|-||Unified Fabric||Cisco||A UCS domain consists of two switching fabrics, A and B, for redundancy reasons. It is called Unified Fabric because it can handle several protocols on the same physical network port, like Ethernet, Fibre Channel (FC) and Fibre Channel over Ethernet (FCoE).|
|FI||Fabric Interconnect||Cisco||Each fabric is physically represented by one FI, and both FIs are running in a HA cluster. This device is basically a layer-2 switch with integrated UCSM.|
|UCSM||UCS Manager||Cisco||The UCSM is the administrative software for Cisco server configuration.|
|IOM||Input/Output Module||Cisco||Every Cisco UCS blade chassis is connected to both fabrics. One IOM to fabric A, the other IOM to fabric B.|
|NIC||Network interface card||common||The actual physical representation of network resources in a server, eg: a network interface card with 2 x 10 Gigabit Ethernet ports.|
|pNIC||Physical network interface card||common||The term NIC would be ambiguous in a Cisco environment, so Cisco uses pNIC when speaking of actual hardware. Cisco offers basically two types of pNICs: Cisco VICs and 3rd-party NICs (Intel, QLogic, Emulex). VMware's understanding of a pNIC (or: Physical Network Adapter) is whatever the hardware layer presents as a network interface to the hypervisor. Attention: On Cisco servers, the VIC is transparent to the hypervisor!|
|VIC||Virtual interface card||Cisco||Cisco's family of pNICs with advanced capability, especially hardware virtualization.|
|vNIC||Virtual network interface card||Cisco||vNIC is the logical representation of network resources that can be consumed by the hypervisor. All configurable aspects of a network connection (MTU size, VLAN, bandwidth, QoS, physical buffer sizes, interrupt throttling, etc.) are defined in the vNIC. So the vNIC is the entity that shows up as a NIC in the hypervisor. From a hypervisor point of view, a Cisco vNIC is therefore a pNIC.|
|vNIC||Virtual network interface card||VMware||If someone talks about a vNIC in a VMware context, they most likely mean the NIC inside the VM. There is no distinct term for a NIC inside a VM; it is often referred to as Virtual Network Adapter or, in certain contexts, simply Network Adapter.|
|vSwitch||Virtual switch||VMware||VMware has two types of vSwitches: the Standard vSwitch and the vSphere Distributed Switch (or: VDS, less common: DVS, DvSwitch). The differences between the two are mostly administrative: Standard vSwitches can be configured on standalone ESXi hosts, whereas vSphere Distributed Switches require a vCenter Server and unify network configuration across multiple hosts. Some parameters (like MTU size) have to be configured on vSwitch level, while other parameters are defined per virtual port group.|
|-||Virtual Port Group||VMware||A virtual port group (or: virtual machine port group, port group) defines the configuration of a group of virtual network ports where virtual machines can connect their virtual network adapters to. Most importantly, the VLAN configuration is done on virtual port group level.|
|vmknic||VMkernel NIC||VMware||The vmknic is a network adapter of the ESXi host. It is used to give the ESXi host one or more IP addresses for management traffic, vMotion, Fault Tolerance, vSphere HA and so on. The vmknic connects to a vSwitch but is not part of a virtual port group.|
|vmnic||Physical NIC / uplink||VMware||As already mentioned, the hypervisor sees the Cisco vNICs as physical network interfaces. ESXi's naming convention for physical NICs is "vmnic" followed by a sequential number. The vmnics are the uplinks for vSwitches. Every vSwitch needs at least one uplink to communicate with the external network (an internal vSwitch with no uplinks is also possible, though).|
General Network Recommendations
For the internal zone, storage zone and virtualization-related networks, it is recommended to configure jumbo frames end-to-end.
Performance-wise, there is no difference between standard vSwitches and distributed vSwitches. The only technical difference that matters here is LACP (port channel), because this configuration is only possible with distributed vSwitches. Be aware that LACP on ESXi hosts is not possible on UCS B-series and managed C-series anyway.
On an ESXi host, it is recommended to configure all vNICs with MTU 9000 and trunk mode (respectively allowing all necessary VLANs on it). The actual MTU size and VLAN of the specific network interface, whether it is on the ESXi host level or inside a VM, is defined later on the virtualization layer itself.
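To check that jumbo frames really work end-to-end, one can send maximum-size packets with the don't-fragment bit set: 8972 bytes of ICMP payload plus 28 bytes of ICMP/IP headers equal MTU 9000. `<target-ip>` is a placeholder; a command sketch:

```
# On the ESXi host (VMkernel interface):
vmkping -d -s 8972 <target-ip>

# Inside a Linux guest:
ping -M do -s 8972 <target-ip>
```

If these pings fail while smaller packets succeed, some hop in the path (vNIC, vSwitch, upstream switch, or target interface) is not configured for MTU 9000.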
Regarding the network adapter placement, the devices might show up in an unexpected order. In order to align vNIC and vmnic enumeration, apply vNIC Placement Policy according to the UCS Manager Server Management Guide and follow VMware KB 2091560 and KB 2019871.
UCS B-Series and Managed C-Series
On UCS B-series and managed C-series, the virtual port groups can be consolidated on one vSwitch with two active uplinks, one on each fabric. This simple configuration fits many environments.
But if you run VMs that communicate heavily with each other (eg. HANA scale-out VMs or HSR VMs), it is beneficial to keep this inter-VM communication within the same fabric. To keep this communication on the local fabric, your configuration should look like this:
- One vSwitch with one active uplink on fabric A and one passive uplink on fabric B
- This vSwitch contains a part of the virtual port groups of VMs that heavily communicate with each other
- One vSwitch with one active uplink on fabric B and one passive uplink on fabric A
- This vSwitch contains another part of the virtual port groups of VMs that heavily communicate with each other
- One vSwitch with two active uplinks on both fabrics
- This vSwitch contains virtual port groups of VMs with no heavy inter-VM traffic and mostly communicate with destinations outside the UCS fabric
All vNICs should be configured without fabric failover. With fabric failover enabled on the vNIC, there would be less visibility on ESXi level during NIC or fabric failure scenarios as to which uplink uses which fabric.
UCS C-Series Standalone
On UCS C-series standalone, things are a little different. If you use Cisco VIC cards, you still have the advantage of the hardware-virtualized vNICs. Take a look at this example configuration for a 4-socket server hosting several productive and non-productive VMs:
Before going into storage hardware and virtual disk configuration, let me just make a remark: don't underestimate the importance of a proper storage configuration in a physical as well as in a virtual environment! A lot of support cases have turned out to be storage-related. While errors in VM size and configuration can be corrected easily, a bad storage design is hard to roll back. Luckily, in a VMware environment, you have the possibility to do Storage vMotion, which might help if the underlying hardware needs to undergo a redesign.
Ultimately, the storage vendor should be involved as well. There are plenty of best practices out there from every vendor who offers SAP HANA TDI certified solutions.
The storage system must be certified as SAP HANA TDI Storage.
It is recommended to physically separate the origin (VMFS LUN or NFS export) of the datastores providing data and log, for performance reasons. On today's enterprise storage systems, this recommendation might be obsolete because of advanced processing and caching. For SAP HANA environments, these performance classes can be distinguished:
OS boot disk
Virtual Disk Configuration
To fully utilize the virtual resources, a disk distribution is recommended where the disks are connected to different virtual SCSI controllers. This improves parallel I/O processing inside the guest OS.
Consult the SAP HANA TDI - Storage Requirements to get the disk sizes right. The size of the disks can initially be chosen smaller than for the HANA appliance models, because the virtual disk size can be increased online.
|Disk||Size (old measure)||Size (new combined measure)|
|/hana/shared||1 x vRAM||1 x vRAM|
|/hana/data||3 x vRAM||1.5 x vRAM**|
|/hana/log||1 x vRAM||vRAM < 512 GB: 0.5 x vRAM; vRAM ≥ 512 GB: min. 512 GB|
** The official recommendation is: 1.2 * net disk space for data (use Quick Sizer HANA Version to determine)
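One plausible reading of the size table above (new combined measure) can be put into a small helper; the function name is mine, and for the data volume the official recommendation remains 1.2 x net disk space, with 1.5 x vRAM as the shortcut:

```python
# Rule-of-thumb virtual disk sizing based on the size table above.
# Illustrative only -- always verify against the SAP HANA TDI - Storage
# Requirements document and the Quick Sizer output.

def hana_disk_sizes_gb(vram_gb: int) -> dict:
    """Return rough data/log/shared disk sizes in GB for a given vRAM."""
    log = 0.5 * vram_gb if vram_gb < 512 else 512
    return {
        "shared": vram_gb,      # 1 x vRAM
        "data": 1.5 * vram_gb,  # 1.5 x vRAM (shortcut for 1.2 x net data)
        "log": log,             # 0.5 x vRAM below 512 GB, else min. 512 GB
    }

print(hana_disk_sizes_gb(256))  # {'shared': 256, 'data': 384.0, 'log': 128.0}
```

Since the virtual disks can be grown online, starting with these values and extending later is usually safe.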
A realistic storage quality classification as well as a thorough distribution of the disks among datastores on ESXi host level and virtual SCSI adapters on VM level ensures good disk I/O performance for SAP HANA workloads. You can distribute the VMDKs on differently tiered storage classes for performance reasons. In the end, when you bring together physical storage design and virtual disk layout, your I/O layer might look like this:
CPU and Memory Sizing
This document is not a sizing instrument but delivers a guideline of technically possible configurations. For proper sizing, refer to BW on HANA (BWoH) and Suite on HANA (SoH) sizing documentation. Please be aware that the hardware partner doesn't "do sizing"! The sizing input always has to be delivered by the application owner, and after thoroughly working with sizing tools like Quick Sizer, you get a sizing output. This output from your sizing exercise is then translated into hardware by the hardware vendor.
Please be aware that Intel Xeon Scalable Processors (Skylake) are not yet certified for HANA on VMware! Please verify the latest support statement by visiting the SAP HANA on VMware vSphere wiki page. In the later examples, we give only a preview for Skylake processors, but there's no support for productive setups on this hardware platform, yet!
There are also changes in the sizing process since SAP HANA TDI Phase 5. We will soon update this page to incorporate these changes.
Basically, HANA hardware sizing is quite simple because of the fixed CPU : Memory ratio, which looks like this for Skylake and the most recent HANA version:
- BWoH: 0.75 TB per socket
- SoH: 1.5 TB per socket
It is the variety of processors and HANA revisions that make things complicated. This matrix just does the math for you:
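For the simple Skylake case above, the socket math can be sketched like this; the helper is illustrative only and ignores rounding to real server configurations and the processor/revision matrix:

```python
# Minimal socket-count estimate from the fixed CPU : memory ratio above
# (Skylake: 0.75 TB per socket for BWoH, 1.5 TB per socket for SoH).
import math

TB_PER_SOCKET = {"BWoH": 0.75, "SoH": 1.5}

def sockets_needed(workload: str, memory_tb: float) -> int:
    """Sockets required for a given workload type and memory demand."""
    return math.ceil(memory_tb / TB_PER_SOCKET[workload])

print(sockets_needed("SoH", 3))   # 2
print(sockets_needed("BWoH", 3))  # 4
```

The same 3 TB demand thus needs twice the sockets for BWoH as for SoH, which is exactly why the workload type must be known before any hardware is chosen.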
It is recommended to configure the cores per socket according to the actual hardware configuration, which means:
- 10 cores per socket on HANA certified Westmere-EX processors
- 15 cores per socket on HANA certified Ivy Bridge-EX processors
- 18 cores per socket on HANA certified Haswell-EX processors
- 22 cores per socket on HANA certified Broadwell-EX processor E7-8880v4
- 24 cores per socket on HANA certified Broadwell-EX processor E7-8890v4
- 28 cores per socket on HANA certified Skylake processors 8180(M) and 8176(M)
The total amount of vCPUs is then determined by setting the appropriate number of sockets. The cores per socket should not be changed and should always reflect the underlying hardware. If you have smaller VMs sharing a socket (which is also supported for up to two productive VMs under vSphere 6.0), there is often the attempt to size them in a way that the total number of vCPUs running on a socket equals the total number of threads on that socket, resulting in quite small and uneven vCPU assignments for each VM. This is not a problem per se as long as it complies with the CPU : memory ratio, but in this case I'd rather recommend over-committing CPU resources than cutting them short in order to follow the strict no-over-commitment rule. Memory over-commitment is a serious problem and must never be done for HANA VMs, that's a fact. But CPU over-commitment can easily be handled by the hypervisor with a negligible penalty in most NUMA node sharing scenarios. If you fear that some VMs might steal CPU cycles from other VMs, apply proper vSphere Resource Management policies like CPU Shares. If there's still some concern about over-committing CPU resources, think about this:
- With CPU over-commitment, VMs might run slower than usual in an overall high-load situation because the total amount of physical CPU resources cannot satisfy the demand of all VMs at once. The only disadvantage would be unpredictable drops of performance on affected VMs, which can be mitigated through proper resource management.
- With a tight vCPU assignment that does not over-commit CPU resources, all VMs will always run slower than they potentially could because they cannot fully utilize the physical CPU resources; only in high-load situations might the physical CPU resources get fully utilized. The only advantage would be more predictable performance.
What you should also consider is that the ESXi host and every virtual machine produce some memory overhead. For example, an ESXi server with 512 GB physical RAM cannot host a virtual machine with 512 GB vRAM, because the server needs some static memory space for its kernel and some dynamic memory space for each virtual machine. A virtual machine with e.g. 500 GB vRAM would most likely fit into a 512 GB ESXi host.
There's also a blog on sizing a virtual machine for HANA.
Sizing E5 Entry Level Systems
Source: TDI Overview, April 2016, SAP SE
The sizing for E5 systems works differently. There is no specific CPU : RAM ratio, nor a specific supported E5 CPU model; therefore there can be no sizing guideline comparable to that for E7 systems. The E5 systems are meant to be a cost-optimized alternative to the regular E7 systems - that's why they are called "Entry Level Systems" in the context of HANA. Usually they are used for non-production (see SAP Note 2271345), but productive systems on E5 are also possible, following these standards:
- 8 cores per socket minimum
- up to 1.5 TB RAM
- homogeneous symmetric assembly of DIMMs
- maximum utilization of all available memory channels
- storage: local or remote, fulfilling the TDI storage requirements and passing the KPI test
This gives a lot of freedom for customers to build their Entry Level Systems. The CPU model can range from 8 to 22 cores per socket and from 1.6 to 3.2 GHz per core, with a maximum RAM size of 1.5 TB. Defining a fixed vCPU : vRAM ratio for such a variety of processors is simply not possible. Additionally, if multiple VMs with production systems should run on one host, the restriction from SAP Note 2024433 also applies to Entry Level Systems, which limits the number of allowed VMs to two:
The vCPUs of a single VM must be pinned to physical cores, so the CPU cores of a socket get exclusively used by only one single VM. A single VM may span more than one socket, however. If a VM running SAP HANA as workload uses physical cores of a certain socket there shall not be any other VM (used for SAP HANA or any other workload) making use of any of the physical cores of this same socket. CPU and Memory overcommitting must not be used.
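If hard pinning is required to satisfy this rule, the sched.vcpu.affinity parameter mentioned earlier can express it per VM. The core numbers below are purely illustrative, assuming a 2-socket host with 22 cores per socket and one VM per socket:

```
# VM 1 (.vmx) - pinned to the cores of socket 0:
sched.vcpu.affinity = "0-21"

# VM 2 (.vmx) - pinned to the cores of socket 1:
sched.vcpu.affinity = "22-43"
```

Note that the actual CPU numbering depends on the host topology and hyper-threading, so verify the mapping on the target host before applying such a configuration.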
With all this freedom, how should productive VMs on E5 be configured, then? One thing we can derive from SAP's guidelines is a ratio of 1 socket : 768 GB RAM, but this still doesn't say anything about the actual compute power behind this number. If you use an E5 2699 v4, you get totally different performance than with an E5 2667 v4. It's not more or less performance, it's different performance: you get more cores at the cost of a lower clock speed. HANA usually prefers cores over clock speed, but there's a reason why only the top-notch CPU models are chosen for E7 systems: to get a lot of cores and still have a decent clock speed. So when you run a 1.5 TB HANA system with low-end E5 CPUs (8 cores at 1.7 GHz, for example), there's a good chance that the user experience will suffer. Good E5 CPUs also come at a price; see the comparison between an E7 CPU supported for HANA and the E5 CPU with the highest accumulated throughput:
|CPU Model||Cores||Clock||Cache||List Price|
|E5 2699 v4||22||2.2 GHz||55 MB||4,115 $|
|E7 8880 v4||22||2.2 GHz||55 MB||5,895 $|
Of course, an E7 CPU has greater memory throughput and, because of its enhanced capabilities, is integrated in systems with greater reliability, availability, and serviceability. But the whole point of E5 systems is to save money, and using expensive CPUs in low-cost servers defeats that purpose.
In the end, there's no answer to the question of how to size HANA VMs on an E5 server, because there's no definite guideline on how to size HANA on E5 servers at all. It's all about optimizing the price : performance ratio, nothing else. If you want adequate systems for your HANA database, virtualized or not, use E7. If you want to save money, use E5, and depending on how far you want to drive the price down, choose a corresponding processor.
For more information on sizing SAP HANA Entry-Level Systems, refer to the Cisco white paper in the References section.
Virtual Machine Creation
I removed the former screenshots of virtual machine creation as I think this is basic VMware knowledge. Instead, let me just highlight some key takeaways from the past years:
- Configure four SCSI controllers with controller type VMware Paravirtual (pvscsi).
- Creating VMDKs with "eager zeroed thick" improves initial write performance.
- Use VMXNET3 as network adapter.
- Be careful when changing Latency Sensitivity setting, as this has some side effects (see SAP Note 2015392).
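Put into a VM configuration, the takeaways above would look roughly like the following .vmx fragment (a sketch, not a complete VM configuration; key names are standard vSphere options):

```
# Four paravirtual SCSI controllers for parallel I/O:
scsi0.present = "TRUE"
scsi0.virtualDev = "pvscsi"
scsi1.present = "TRUE"
scsi1.virtualDev = "pvscsi"
scsi2.present = "TRUE"
scsi2.virtualDev = "pvscsi"
scsi3.present = "TRUE"
scsi3.virtualDev = "pvscsi"

# VMXNET3 virtual network adapter:
ethernet0.virtualDev = "vmxnet3"
```

In practice these settings are usually made in the vSphere client during VM creation; the fragment just shows what should end up in the configuration.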
As already mentioned in the sizing section, the assignment of CPU resources is a trade-off between predictable performance and maximum utilization of the hardware. For productive HANA systems with high utilization patterns, it is still recommended to reserve CPU and memory. While the CPU reservation can vary, and in some scenarios CPU over-commitment is acceptable, the memory reservation has to be set to 100 %. This also has the positive side effect that the size of the VM swap file is 0 bytes.
If you decide not to over-commit CPU resources, you will achieve the highest and most predictable performance. In this case, you can change the behavior of the CPU scheduler to reduce scheduling overhead even further by setting two parameters. Be aware that this sacrifices any possible consolidation effects on your host:
halt_in_monitor = "TRUE"
idleLoopSpinBeforeHalt = "TRUE"
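For the 100 % memory reservation mentioned above, the corresponding entries in the VM's configuration file would look roughly like this; the value is an assumed example for a 256 GB VM, and both keys take megabytes:

```
memSize = "262144"        # 256 GB of vRAM, value in MB
sched.mem.min = "262144"  # reserve all guest memory -> 0-byte VM swap file
```

Setting the reservation via the vSphere client ("Reserve all guest memory") achieves the same result.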
Apply SAP Note 1606643 - Linux: VMware vSphere host monitoring interface. This gives the SAP software insight into hypervisor performance metrics.
Guest Operating System
I'll not maintain OS installation instructions in this wiki. The installation of the Linux OS has to be done according to the SAP Notes for HANA systems on SLES or RHEL, see Reference section.
With regards to the network configuration, it is not recommended to configure bond devices inside the Linux guest OS. Such a configuration is used in native environments to guarantee availability of the network adapters. In virtual environments, the redundant uplinks of the vSwitch take on that role.
In /etc/sysctl.conf some tuning might be necessary for scale-out and in-guest NFS scenarios:
net.ipv4.tcp_slow_start_after_idle = 0
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.optmem_max = 16777216
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_rmem = 65536 262144 16777216
net.ipv4.tcp_wmem = 65536 262144 16777216
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
sunrpc.tcp_slot_table_entries = 128
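The settings can be loaded without a reboot and spot-checked afterwards:

```
# Load the settings from the file and verify one of them:
sysctl -p /etc/sysctl.conf
sysctl net.ipv4.tcp_slow_start_after_idle
```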
According to VMware KB 2053145, add the following two parameters to the kernel command line:
Apply SAP Note 2161991 - VMware vSphere configuration guideline
Apply OS specific settings according to SAP Note 2235581 - SAP HANA: Supported Operating Systems
To validate the solution, the same hardware configuration check tool as for the appliances is used, but with slightly different KPIs. These tests do not have to be executed on already certified SAP HANA TDI storage, but they give good information about the disk I/O performance of your setup.
|Test File Size||Initial Write (MB/s)|
Source: SAP SE, Version 1.9