This blog article describes the different SCSI controllers in VMware ESXi and why one may be a better choice than another. I have had multiple discussions with customers at my current job about why, and in which situations, the LSI Logic SAS or LSI Logic Parallel adapter makes more sense than the Paravirtual SCSI adapter (PVSCSI), and I didn’t really find a good blog article or KB explaining what I think is needed to really understand the differences. What I see in most environments is the default adapter for the chosen operating system, and in many cases that is absolutely fine and works well. The problem starts when there is a limitation somewhere, but how do you find that out? Let’s start from the beginning. With the current version, ESXi 6.0, there are five controller options (four SCSI adapters plus AHCI SATA), which are shown in the following table:
SCSI controller comparison
| Adapter Type | OS Type | Minimum Requirements | Maximum SCSI Adapters | Use Cases |
|---|---|---|---|---|
| BusLogic Parallel | Server | - | 4 | 15 devices per controller, issues with 64-Bit OS, 2TB VMDK limit, VMware suggests to migrate off this adapter |
| LSI Logic Parallel (formerly LSI Logic) | Server/Desktop | - | 4 | 15 devices per controller, required for Microsoft Clustering Service (older than Windows 2008) |
| LSI Logic SAS | Server/Desktop | HWv7 | 4 | 15 devices per controller, with most of OS the standard SCSI controller, required for MSCS (Windows 2008 or newer) |
| PVSCSI | Server | HWv7 | 4 | 15 devices per controller, lower CPU cost in many higher I/O use cases, suggested for high I/O use cases, no MSCS support for ESXi 5.5 U2 and lower |
| AHCI SATA | Server/Desktop | HWv10 | 4 (on top of the existing SCSI controller) | 30 devices per controller, not recommended for high I/O environments, not as efficient as LSI Logic SAS or PVSCSI |
Table 1: Adapter Types
Since AHCI SATA controllers don’t bring a huge benefit to the table, this controller mainly helps when a large number of additional disks is needed and performance is not the main concern. You can add AHCI SATA controllers on top of the maximum number of supported SCSI controllers you already use.
It is also good to know since when the different adapters have been supported, which is shown in the following table.
| Feature | ESXi 6.0 | ESXi 5.5 | ESXi 5.1 | ESXi 5.0 | ESXi 4.x | ESXi 3.5 |
|---|---|---|---|---|---|---|
| Hardware Version | 11 | 10 | 9 | 8 | 7 | 4 |
| Supported SCSI Adapters | BusLogic, LSI Parallel, LSI SAS, PVSCSI, AHCI | BusLogic, LSI Parallel, LSI SAS, PVSCSI, AHCI | BusLogic, LSI Parallel, LSI SAS, PVSCSI | BusLogic, LSI Parallel, LSI SAS, PVSCSI | BusLogic, LSI Parallel, LSI SAS, PVSCSI | BusLogic, LSI Logic |
Table 2: ESXi Support
x86 Architecture
So I think you agree that the two really interesting adapters are the LSI Logic SAS and the PVSCSI adapter. What does the word “Paravirtual” even mean? Before getting into that, let’s first understand where the drivers sit in the stack. In an x86 architecture you will always find four privilege levels. They are called rings or CPU modes and are numbered Ring 0 to Ring 3. Traditional operating systems like Windows use only two of them, because at the time not all available processors supported more than two modes. These are Ring 3, the least privileged, for all applications (user mode) and Ring 0, the most privileged, for the kernel (kernel mode). So every time a user application wants to use the hardware, the CPU has to switch from user mode into kernel mode. If you would like to learn more about kernel vs. user mode I suggest reading this blog article. The following figure shows the traditional modes in an x86 architecture.
Figure 1: Traditional x86 Architecture
Binary translation using VMM
In a virtualized environment the hypervisor sits directly on top of the physical hardware, so it becomes very difficult for a guest OS to run in Ring 0, because Ring 0 is now used by the hypervisor itself. What makes this even more complicated is that some instructions can only be executed in Ring 0. So what to do now? VMware introduced binary translation techniques that allow the Virtual Machine Monitor (VMM) to run in Ring 0. This helps the VM because it can now execute these instructions in Ring 0 with the help of the VMM. For the application itself everything stays as it is. The following figure shows how that looks.
Figure 2: Binary Translation of OS Requests
Paravirtual implementation in ESXi
The name Paravirtual SCSI adapter is a bit of a misnomer, as all the virtual hardware in a guest VM is paravirtual. The same is true for VMXNET3, which is also a VMware-specific adapter. For both you need to install a driver in the guest OS to be able to use these adapters. For VMXNET3 this already happens when VMware Tools is installed.
The paravirtual driver gets access to the ESXi kernel without having to communicate with the system hardware via the VMM. It makes a "hypercall" to ESXi for certain critical operations such as scheduling, interrupts and memory management. So from the driver's perspective it is pretty much a direct channel to the kernel. In general the PVSCSI adapter offers better performance with lower CPU usage than the other SCSI controller options. But as with everything, it depends. I will compare the LSI Logic SAS and the PVSCSI adapter in more detail in the next section. The following figure shows the general implementation of "hypercalls" using a paravirtual adapter.
Figure 3: Paravirtual implementation in ESXi
Coalescing
Coalescing is a synonym for merging, joining or assembling, but what does that have to do with a SCSI controller in a guest VM? Very simple: it optimises I/O in a very intelligent way. Let me point out what coalescing means:
- A technique for storage driver efficiency
- Coalescing can be thought of as buffering, where multiple events are queued for simultaneous processing
- It improves efficiency and reduces interrupts, but I/O must stream fast enough to create a large batch request
- If the incoming stream of I/O is too slow, a timeout window passes first and the I/O is delayed unnecessarily
- The LSI Logic SAS and PVSCSI adapters handle interrupt coalescing differently, based on two factors:
  - Outstanding I/O: VM demand for I/O
  - IOPS: Storage system supply of I/O
Let's now compare how this works with the two adapters.
- LSI SAS: The driver increases coalescing as outstanding I/O (OIO) and IOPS increase. No coalescing is used with few OIO or low throughput, so the driver is very efficient where OIO and I/O throughput are small.
- PVSCSI: The driver coalesces based on OIO only, not on throughput. When the VM is doing a lot of I/O but the storage does not deliver right away, the PVSCSI driver still coalesces interrupts. Without the storage supplying a steady stream of I/Os there are obviously no interrupts to coalesce. The result is slightly increased latency and no gain for the PVSCSI controller in low-throughput environments.
- CPU cost: The difference between the LSI Logic SAS and the PVSCSI controller at very low IOPS is not measurable, but with larger numbers of IOPS the PVSCSI controller saves a huge amount of CPU cycles.
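The difference between the two policies can be sketched as a toy model. This is only an illustration of the decision logic described in the list above, not the actual driver implementation: the function names are made up, and both thresholds are hypothetical values chosen for the example.

```python
# Toy model of the two coalescing policies described above.
# Both thresholds are hypothetical; the real drivers use their own heuristics.
OIO_THRESHOLD = 4       # assumed outstanding-I/O level at which coalescing starts
IOPS_THRESHOLD = 2000   # assumed throughput level at which coalescing starts


def lsi_sas_coalesces(oio, iops):
    """LSI Logic SAS: coalesce only when OIO and IOPS are both high."""
    return oio >= OIO_THRESHOLD and iops >= IOPS_THRESHOLD


def pvscsi_coalesces(oio, iops):
    """PVSCSI: coalesce on OIO alone; the throughput value is ignored."""
    return oio >= OIO_THRESHOLD


# (outstanding I/O, IOPS) pairs for three made-up workloads
workloads = {
    "idle VM": (1, 100),
    "busy VM, slow storage": (32, 200),
    "busy VM, fast storage": (32, 20000),
}

for name, (oio, iops) in workloads.items():
    print(f"{name}: LSI SAS={lsi_sas_coalesces(oio, iops)}, "
          f"PVSCSI={pvscsi_coalesces(oio, iops)}")
```

The interesting case is the busy VM on slow storage: in this model only PVSCSI waits to batch interrupts, which is where the slightly increased latency mentioned above comes from.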
According to VMware KB 1017652, a workload that peaks at roughly 2,000 IOPS or more with 4 or more OIO is a good reason to use the PVSCSI adapter. For later versions of ESXi, VMware suggests using the PVSCSI adapter also with lower IOPS and OIO requirements.
Queue Depth
To maximize performance, virtual disks should be distributed across multiple vSCSI adapters. A maximum of 4 vSCSI adapters can be configured per VM, with a maximum of 15 virtual disks per vSCSI adapter. By using multiple vSCSI adapters you open up more I/O queues. The following table compares the queue depths of the PVSCSI adapter with those of the LSI Logic SAS adapter.
| Queue | PVSCSI | LSI Logic SAS |
|---|---|---|
| Default Adapter Queue Depth | 256 | 128 |
| Maximum Adapter Queue Depth | 1024 | 128 |
| Default Virtual Disk Queue Depth | 64 | 32 |
| Maximum Virtual Disk Queue Depth | 254 | 32 |
Table 3: Queue Depth
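Distributing virtual disks across adapters can be sketched as a simple round-robin assignment. This is a planning aid under the limits stated above (4 adapters per VM, 15 disks per adapter), not a VMware API call; the function name `spread_disks` and the disk names are hypothetical.

```python
MAX_ADAPTERS = 4            # vSCSI adapters per VM
MAX_DISKS_PER_ADAPTER = 15  # virtual disks per vSCSI adapter


def spread_disks(disks, adapters=MAX_ADAPTERS):
    """Assign disks to adapters round-robin so each adapter queue serves as few disks as possible."""
    if not 1 <= adapters <= MAX_ADAPTERS:
        raise ValueError(f"adapters must be between 1 and {MAX_ADAPTERS}")
    if len(disks) > adapters * MAX_DISKS_PER_ADAPTER:
        raise ValueError("too many disks for the given number of adapters")
    layout = [[] for _ in range(adapters)]
    for index, disk in enumerate(disks):
        layout[index % adapters].append(disk)
    return layout


# Example with six made-up disk names
layout = spread_disks([f"disk{i}" for i in range(6)])
for number, assigned in enumerate(layout):
    print(f"adapter {number}: {assigned}")
```

Whether the extra queues actually help still depends on the storage behind them.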
Tuning of PVSCSI
There is a very good KB, VMware KB 2053145, that you should follow to understand how to tune the different values in ESXi itself as well as in the VM. Two settings can be tuned:
- Adjust the queue depth for the HBAs on the ESXi host
- Increase the PVSCSI queue inside the Windows or Linux guest
Note: The default number of ring pages is 8, each of which is 4 KB in size. One entry of 128 bytes is needed for a queued I/O, which means that 32 entries fit on a single page of 4096 bytes, resulting in 256 entries (8x32). The Windows PVSCSI driver adapter queue was hard coded in previous releases, but it can be adjusted up to the maximum of 32 pages since the versions delivered with VMware Tools.
I took this note from the KB itself, but what does it mean? Let me explain what ring pages and queue depth really stand for, as they are often misunderstood.
Ring Pages:
The 128-byte entry describes the I/O. It is not the actual memory to or from which the I/O is directed. In other words, the pages used for the ring buffer describe the actual I/O operation; they are not used for the I/O itself. Think of them as holding pointers to the pages that will be used for the DMA operation. A portion of a (non-ring) page may be used for one I/O and another portion, possibly even an overlapping one, for another I/O; sharing a page like this is possible.
Queue Depth:
The queue depth is a number that, in the case of PVSCSI, reflects the limits of the adapter.
By default the adapter uses 8 ring pages and thus supports a queue depth of 256. It is really an artificial number, since PVSCSI is not a real device but a VMware paravirtualized SCSI device. For other devices (real adapters), however, it reflects an actual hardware limit: the ring is in the hardware and has a limit, and hence so does the queue depth. The queue depth is per adapter. So if you have 4 PVSCSI adapters, or LSI adapters for that matter, you get 4 * queue depth. As a consequence, there will be 4 * ring pages as well.
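The arithmetic behind these numbers is easy to check. The sketch below only restates the figures from the KB note (4096-byte pages, 128-byte entries, 8 default and 32 maximum ring pages); the function names are made up for the example.

```python
PAGE_SIZE_BYTES = 4096   # size of one ring page
ENTRY_SIZE_BYTES = 128   # size of one entry describing a queued I/O


def adapter_queue_depth(ring_pages):
    """Number of I/O entries that fit into the ring of a single adapter."""
    entries_per_page = PAGE_SIZE_BYTES // ENTRY_SIZE_BYTES  # 32 entries per page
    return ring_pages * entries_per_page


def vm_queue_slots(ring_pages, adapters):
    """The queue depth is per adapter, so it scales with the adapter count."""
    return adapters * adapter_queue_depth(ring_pages)


print(adapter_queue_depth(8))    # 256, the default adapter queue depth
print(adapter_queue_depth(32))   # 1024, the maximum adapter queue depth
print(vm_queue_slots(8, 4))      # 1024 entries across four default adapters
```

The first two results match the default and maximum PVSCSI adapter queue depths in Table 3.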
LSI Logic:
It is a similar construct. The only difference is that the hardware controller has an upper limit based on real queue resources on the HBA. However, the driver may artificially allow you to use a lower value than what the hardware can support. If the driver lets you set a higher value than what the HBA claims, the excess I/O is queued in the driver instead of in the ESXi SCSI mid-layer queues. In that case it is accounted for as part of the queueing delay.
Conclusion
So what is the conclusion here? I would say it depends on what you need, as with everything in life. My view, though, is that you should use the PVSCSI controller more often rather than less. At my previous company we were running 1500+ VMs. At one point, I guess in 2010, when we were 80-90% virtualized, I asked myself how the hell we could find out where the LSI and where the PVSCSI controller makes sense. Knowing that the monitoring tools at that time were much worse than today, I decided, as the Infrastructure Architect, that we would use the PVSCSI controller in every template. No matter whether it was a small site with a single ESXi server and 3 VMs or the big ERP systems in the centralized datacenters, we used the same configuration everywhere. In existing VMs we replaced the existing SCSI controller with the PVSCSI adapter in the next maintenance window. Maybe in the end it was not beneficial for all the small VMs with very low I/O, but it certainly helped the VMs that required higher I/O. Obviously this was my own decision, but in my opinion it was the right one, as downtime to change a SCSI controller in a VM always comes at a cost.
Tuning, on the other hand, is something I would do very selectively, because it depends on how your storage system performs; otherwise tuning these parameters simply makes no difference. A few years back I thought that adapting the queue depth on the HBA or the SCSI controller would always help improve performance, but that really depends on what your storage system and the stack between server and storage can deliver. If nobody is taking the I/O out of the queue, it does not make sense to fill it up with many I/Os, but if the storage system performs very well, you can most likely solve your bottleneck on the VM side this way. What does it mean that the storage system performs very well? That is something you have to find out by testing and tweaking. The easiest way is to use local OS tools like Perfmon in Windows and see whether your average disk queue length is constantly at the limit of the adapter. Additional PVSCSI adapters also only bring a benefit when the storage is able to handle the load. I remember one colleague who always wanted to configure as many separate drives and controllers as possible to spread the load, but if your storage is the bottleneck, that just complicates the configuration. So more is not always more: keep it simple so that it works well for 98% of your systems, and then focus on the last 2% and tweak them if there is a need.
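The Perfmon check can be turned into a quick calculation once you have exported some queue length samples. The sketch below is only one possible way to do it: the sample values are made up, the limit of 32 is the LSI Logic SAS virtual disk queue depth from Table 3, and the 90% tolerance is an arbitrary assumption.

```python
def queue_saturation(samples, queue_limit, tolerance=0.9):
    """Return the fraction of samples at or above tolerance * queue_limit."""
    if not samples:
        raise ValueError("need at least one sample")
    near_limit = sum(1 for value in samples if value >= tolerance * queue_limit)
    return near_limit / len(samples)


# Made-up average disk queue length readings, e.g. exported from Perfmon
samples = [12, 30, 31, 32, 32, 29, 32, 8]
share = queue_saturation(samples, queue_limit=32)
print(f"{share:.0%} of the samples are close to the queue limit")
```

If most samples sit near the limit while the storage still has headroom, a deeper queue is worth testing; if the storage is already the bottleneck, it is not.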
If you have any questions please let me know.
Sources and inspirations:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1017652
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1010398
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2053145
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1037959
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1267
Blogs:
https://blogs.vmware.com/vsphere/2014/02/vscsi-controller-choose-performance.html
https://pelicanohintsandtips.wordpress.com/2015/09/23/vmware-paravirtual-scsi-adapter-pvscsi/
https://clearwaterthoughts.wordpress.com/2011/05/06/virtual-scsi-adapters-vs-para-virtual-scsi-pvscsi-adapters-vs-vm-direct-path-io/