When either passing a GPU through directly or attaching an NVIDIA vGPU with 16GB or more of video memory to a virtual machine on VMware ESXi, you may run into a situation where the VM fails to boot with the error “Module ‘DevicePowerOn’ power on failed.” Special considerations are required when performing GPU or vGPU passthrough with 16GB+ of video memory.
This issue is specifically caused by memory mapping a GPU or vGPU device that has 16GB of memory or higher, and could involve both the host system (the ESXi host) and/or the Virtual Machine configuration.
In this post, I’ll address the considerations and requirements to passthrough these devices to virtual machines in your environment.
In order of likelihood, the cause is usually related to the VM configuration. However, if the recommendations in the “VM GPU and vGPU Configuration Considerations” section do not resolve the issue, please proceed to the “ESXi GPU and vGPU Host Considerations” section.
Please note that if the issue is host related, other errors may be present, or the device may not even be visible to ESXi.
VM GPU and vGPU Configuration Considerations
First and foremost, all new VMs should be created using the “EFI” Firmware type. EFI provides numerous advantages in device access and memory mapping versus the older style “BIOS” firmware types.
To do this, create a new virtual machine, navigate to “VM Options”, expand “Boot Options”, and confirm/change the Firmware to “EFI”. I recommend this for all new VMs, and not only for VMs accessing GPUs or vGPUs with over 16GB of memory. Please note that you shouldn’t change an existing VM, and should do this on a fresh new VM.
When performing GPU or vGPU passthrough with 16GB+ of video memory, you’ll need to create a couple of entries under “Advanced” settings to properly configure access to these PCIe devices and provide the proper environment for memory mapping. The lack of these settings is specifically what causes the “Module ‘DevicePowerOn’ power on failed.” error.
Under the VM settings, head over to “VM Options”, expand “Advanced” and click on “Edit Configuration”, click on “Add Configuration Params”, and add the following entries:
You’ll notice that while our GPU or vGPU profile may have 16GB of memory, we need to double that value, and set it for the “pciPassthru.64bitMMIOSizeGB” variable. If your card or vGPU profile had 32GB, you’d set it to “64”.
Additionally if you were passing through multiple GPUs or vGPU devices, you’d need to factor all the memory being mapped, and double the combined amount.
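The doubling rule above can be sketched as a small calculation. This is only an illustration: the function names are made up for this example, and the “pciPassthru.use64bitMMIO” entry is my assumption of the companion setting that normally accompanies “pciPassthru.64bitMMIOSizeGB”, so verify both against VMware’s documentation for your version.

```python
def mmio_size_gb(gpu_memory_gb):
    """Apply the rule from this post: add up the memory (in GB) of every
    GPU or vGPU attached to the VM, then double the combined amount."""
    return sum(gpu_memory_gb) * 2


def advanced_params(gpu_memory_gb):
    """Build the advanced configuration entries for the VM.

    Assumption: "pciPassthru.use64bitMMIO" is the companion setting that
    enables 64-bit MMIO; only the size setting is named in the text above.
    """
    return {
        "pciPassthru.use64bitMMIO": "TRUE",
        "pciPassthru.64bitMMIOSizeGB": str(mmio_size_gb(gpu_memory_gb)),
    }


if __name__ == "__main__":
    # One 16GB card or vGPU profile, as in the example above.
    for name, value in advanced_params([16]).items():
        print(f'{name} = "{value}"')
```

Passing a list such as [16, 16] covers the multiple-device case, since all mapped memory is combined before doubling.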
ESXi GPU and vGPU Host Considerations
On most new and modern servers, the host level doesn’t require any special configuration as they are already designed to pass through such devices to the hypervisor properly. However in some special cases, and/or when using older servers, you may need to modify configuration and settings in the UEFI or BIOS.
If setting the VM configuration above still results in the same error (or possibly other errors), then you most likely need to make modifications to the ESXi host’s BIOS/UEFI/RBSU to allow the proper memory mapping of the PCIe device, in our case the GPU.
This is where things get a bit tricky because every server manufacturer has different settings that will need to be configured.
Look for the following settings, or settings with similar terminology:
“Memory Mapping Above 4G”
“Above 4G Decoding”
“PCI Express 64-Bit BAR Support”
“64-Bit IOMMU Mapping”
Once you find the correct setting or settings, enable them.
Every vendor could be using different terminology and there may be other settings that need to be configured that I don’t have listed above. In my case, I had to go into a secret “SERVICE OPTIONS” menu on my HPE Proliant DL360p Gen8, as documented here.
After performing the recommendations in this guide, you should now be able to passthrough devices with over 16GB of memory.
With VMware ESXi 6.5 and 6.7 going End of Life on October 15th, 2022, many of you are looking for options to update hosts in your homelab, especially in my case putting ESXi 7.0 on HP Proliant DL360p Gen8 servers.
As far as support goes, the last custom ESXi installer HPE provided for these servers was for version 6.5 U3, released in December 2019. This was the “last Pre-Gen9 custom image” released, and ESXi 7.0 on the DL360p Gen8 is totally unsupported.
ESXi 6.7 or higher on the Gen8 Servers
The jump from 6.5 to 6.7 was a little easier, as you could use the 6.5 custom installer, and then upgrade to 6.7. For the most part, as long as the hardware itself was supported, you were in pretty good shape.
Additionally, with the HPE vibsdepot loaded in to VMware Update Manager (now known as Lifecycle Manager), you could also keep all the HPE drivers and agents up to date.
ESXi 7.0 on the Gen8 Servers
Some were lucky enough to upgrade their current installs to 7 with no or limited problems, however the general consensus online was to expect problems. There were some major driver changes, which I think at one point led to an advisory to perform a fresh install, even if you had a fully supported configuration with newer generation servers such as the Proliant Gen9 and Gen10 servers, when upgrading from older versions.
In my setup, I have the following:
2 x HPE Proliant DL360p Gen8 Servers
Dual Intel Xeon E5-2660v2 Processors in each server
USB and/or SD for booting ESXi
No other internal storage
External SAN iSCSI Storage
Concerns and Considerations
My main concern is not only to have a stable and functioning ESXi 7 instance; if possible, I would also like to have the HPE drivers, agents, and integrations with iLO.
You must consider that while this is completely unsupported, you do need to make sure that the components of your current configuration are supported, such as the processor and PCIe cards, even if the server as a whole is not supported.
Boot server, install using the Generic Installer downloaded above.
Mount NFS or iSCSI datastore.
Copy HPE Custom Addon for ESXi zip file to datastore.
Enable SSH on host (or use console).
Place host into maintenance mode.
Run “esxcli software vib install -d /vmfs/volumes/datastore-name/folder-name/HPE-703.0.0.10.9.1.5-Jul2022-Addon-depot.zip” from the command line.
The install will run and complete successfully.
Restart your server as needed. You’ll now notice that not only were the HPE drivers installed, but also agents like the Agentless management agent, and the iLO integrations.
You’ll now have a functioning instance.
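For repeat installs across hosts, the command-line portion of the steps above can be assembled with a small helper. This is a sketch under assumptions: the helper name is hypothetical, the datastore and folder names are the same placeholders used in the command above, and the maintenance mode command is my CLI equivalent of the “place host into maintenance mode” step, which you may prefer to do from the UI.

```python
import shlex


def addon_install_commands(datastore, folder, depot_zip):
    """Return the ESXi shell commands for the CLI part of the steps above.

    The depot path follows the /vmfs/volumes/<datastore>/<folder>/<zip>
    layout used in this post; all three arguments are placeholders.
    """
    depot_path = f"/vmfs/volumes/{datastore}/{folder}/{depot_zip}"
    return [
        # Assumed CLI equivalent of placing the host into maintenance mode.
        "esxcli system maintenanceMode set --enable true",
        # The install command from the steps above, with the path quoted
        # in case the datastore or folder name contains spaces.
        f"esxcli software vib install -d {shlex.quote(depot_path)}",
    ]


if __name__ == "__main__":
    for command in addon_install_commands(
        "datastore-name",
        "folder-name",
        "HPE-703.0.0.10.9.1.5-Jul2022-Addon-depot.zip",
    ):
        print(command)
```

The script only prints the commands; run them yourself over SSH or at the console, then restart the host as needed.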
In my case everything was working, except for the “Smart Array P420i” RAID Controller, which I don’t use anyways.
Additionally, if you have a vCenter instance running, make sure that you add the HPE vibsdepot repo to your Lifecycle Manager. After you add the repo, create a baseline, and attach the baseline to the host, go ahead and proceed to scan, stage, and remediate the server which will then further update all the HPE specific drivers and software.
When it comes to troubleshooting login times with non-persistent VDI (VMware Horizon Instant Clones), I often find delays associated with printer drivers not being included in the golden image. In this post, I’m going to show you how to add a printer driver to an Instant Clone golden image!
Printing with non-persistent VDI and Instant Clones
In most environments, printers will be mapped for users during logon. If a printer is mapped or added and the driver is not added to the golden image, it will usually be retrieved from the print server and installed, adding to the login process and ultimately leading to a delay.
Due to the nature of non-persistent VDI and Instant Clones, every time the user logs in and gets a new VM, the driver is downloaded and installed again, creating a redundant process that wastes time and network bandwidth.
To avoid this, we need to inject the required printer drivers into the golden image. You can add numerous drivers and should include all the drivers that any and all the users are expecting to use.
An important consideration: Try using Universal Print Drivers as much as possible. Universal Printer Drivers often support numerous different printers, which allows you to install one driver to support many different printers from the same vendor.
How to add a printer driver to an instant clone golden image
Below, I’ll show you how to inject a driver into the Instant Clone golden image. Note that this doesn’t actually add a printer, but only installs the printer driver into the Windows operating system so it is available for a printer to be configured and/or mapped.
Let’s get started! In this example we’ll add the HP Universal Driver. These instructions work on both Windows 10 and Windows 11 (as well as Windows Server operating systems):
Click Start, type in “Print Management” and open “Print Management”. You can also click Start, Run, and type “printmanagement.msc”.
On the left hand side, expand “Print Servers”, then expand your computer name, and select “Drivers”.
Right click on “Drivers” and select “Add Driver”.
When the “Welcome to the Add Printer Driver Wizard” opens, click Next.
Leave the default for the architecture. It should default to the architecture of the golden image.
When you are at the “Printer Driver Selection” stage, click on “Have Disk”.
Browse to the location of your printer driver. In this example, we navigate to the extracted HP Universal Print Driver.
Select the driver you want to install.
Click on Finish to complete the driver installation.
The driver you installed should now appear in the list as it has been installed in to the operating system and is now available should a user add a printer, or have a printer automatically mapped.
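If you rebuild golden images often, the same result can likely be scripted instead of clicking through the wizard. The sketch below is not the method used above; it assumes the built-in pnputil tool and the PowerShell Add-PrinterDriver cmdlet are available on the image, and the INF path and driver name shown are hypothetical examples that must match your extracted driver package.

```python
import subprocess
import sys


def driver_install_commands(inf_path, driver_name):
    """Build the two commands: stage the INF in the driver store, then
    register the printer driver by name so it shows up under "Drivers"."""
    quoted_name = driver_name.replace("'", "''")  # escape for PowerShell
    return [
        ["pnputil.exe", "/add-driver", inf_path, "/install"],
        [
            "powershell.exe",
            "-NoProfile",
            "-Command",
            f"Add-PrinterDriver -Name '{quoted_name}'",
        ],
    ]


def install_driver(inf_path, driver_name):
    """Run the commands on the golden image (Windows only, elevated)."""
    if sys.platform != "win32":
        raise RuntimeError("This must be run on the Windows golden image.")
    for command in driver_install_commands(inf_path, driver_name):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical path and driver name, for illustration only.
    for command in driver_install_commands(
        r"C:\Drivers\HP-UPD\hpcu.inf", "HP Universal Printing PCL 6"
    ):
        print(command)
```

The driver name passed to Add-PrinterDriver has to match the name defined inside the INF file exactly, so check it in Print Management after a manual install first.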
Now seal, snap, and deploy your image, and you’re good to go!
Many of you may not be aware of the Azure AD Connect 1.x End of Life on August 31st, 2022. What this means is that as of August 31st, 2022 (later this month), you’ll no longer be able to use Azure AD Connect 1.4 or Azure AD Connect 1.6 to sync your on-premises Active Directory to Azure AD.
It’s time to plan your upgrade and/or migration!
This is catching a lot of System Administrators by surprise. In quite a few environments, Azure AD Connect was implemented on older servers that haven’t been touched (except for Windows Updates) in the years that they’ve been running, because Azure AD Connect “just works”.
Azure AD Connect End of Life
Azure AD Connect has two major releases that are being used right now: 1.x and 2.x.
Version 1.x, which is the release going end of life, is the first release, generally seen installed on older Windows Server 2012 R2 systems (or even earlier versions).
Version 2.x, which is the version you *should* be running, does not support Windows Server 2012. Azure AD Connect 2.x can only be deployed on Windows Server 2016 or higher.
For a lot of you, there is no easy in-place upgrade unless you have 1.x installed on Windows Server 2016 or higher. If you are running 1.x on Server 2016 or higher, you can simply do an in-place upgrade!
If you’re running Windows Server 2012 R2 or earlier, because 2.x requires Server 2016 or higher, you will need to migrate to another system running a newer version of Windows Server.
However, the process to migrate to a newer server is simpler and cleaner than most would suspect. I highly recommend reviewing all the Microsoft documentation (see below), but a simplified overview of the process is as follows:
Deploy new Windows Server (version 2016 or higher)
Export Configuration (JSON file) from old Azure AD Connect 1.x server
Install the latest version of Azure AD Connect 2.x on new server, load configuration file and place in staging mode.
Enable Staging mode on old server (this stops syncing of old server)
Disable Staging mode on new server (this starts syncing of new server)
Decommission old server (uninstall Azure AD Connect, unjoin from domain)
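The order of the staging mode steps matters, because you never want two servers actively syncing at the same time. The sketch below simply models the steps above as a sequence of states and checks that rule; the state names and functions are my own illustration, not anything provided by Azure AD Connect.

```python
def swing_migration_states():
    """Return (step, old_server_mode, new_server_mode) after each step."""
    return [
        ("Old 1.x server syncing, new server deployed", "active", "not installed"),
        ("Install 2.x on new server with config, in staging mode", "active", "staging"),
        ("Enable staging mode on old server", "staging", "staging"),
        ("Disable staging mode on new server", "staging", "active"),
        ("Decommission old server", "removed", "active"),
    ]


def never_two_active(states):
    """True if no step leaves both servers actively syncing."""
    return all([old, new].count("active") <= 1 for _, old, new in states)


if __name__ == "__main__":
    states = swing_migration_states()
    for step, old, new in states:
        print(f"{step}: old={old}, new={new}")
    print("Safe ordering:", never_two_active(states))
```

Note the short window where both servers are in staging mode and nothing is syncing; that is expected with this ordering and is preferable to having two active servers.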
As always, I highly recommend having an “Alternative Admin” account on your Azure AD. If you lose the ability to sync or authenticate against Azure AD, you’ll need a local Azure AD admin account to connect and manage and re-establish the synchronization.
As some of you know, I regularly write about virtualization technologies, in particular VMware. VMware products are not only involved in the work that I do, but part of a personal hobby and passion. I was an early adopter of Virtualization, and on top of that, VDI (Virtual Desktop Infrastructure) has become a personal obsession of mine.
Because of the content I’ve written online, I’ve had the pleasure of helping others with these technologies. Over the years this has brought me new friendships, business customers, and given me a sense of participation in the larger community, ultimately leading to me achieving my VMware vExpert status, as well as being a part of the VMware vExpert EUC sub program.
Even though I’ve been in tech since becoming an adult, I’ve actually never had the opportunity to visit a large-scale conference in person in my entire life. VMware Explore 2022 will be my first in-person tech conference!
So why am I going? What do I hope to get from it? What are my reasons for attending?
Essentially there are 3 big reasons why I’m going to be attending:
Let’s dive in to each one…
As mentioned above, I’ve had the pleasure of being a part of the VMware vExpert program for the past couple years. During this time, it has helped my content reach new audiences, I’ve had the chance to converse and talk with the top industry experts, I’ve also had the chance to learn more about the technologies I love, and it’s given a sense of belonging and participating in something “big”.
Blogging has been a passion of mine for as long as I can remember, with the first post on this blog going back to April 11th, 2010. Blogging has allowed me to not only share my knowledge, but also participate and contribute to the community. This has helped me meet new people, network, learn even more, and also help others pursue their passions and goals with technology.
Attending VMware Explore 2022 will help me take this a step further to actually meet some of those in the community face to face. I love meeting new people, and this will allow me to engage with those who have stumbled across my blog, and it will also allow me to meet those who are leaders with the community and hopefully even learn some new things from them.
I’ve already started working on my list of people to meetup with!
In addition to the knowledge I hope to learn from others in the community, VMware Explore 2022 has over 600 technical sessions (some even hosted by fellow vExperts) where you can learn more about the technologies you use every day, as well as technologies you’re considering or planning on using in the future.
In particular, a few products and solutions I want to increase my knowledge with are:
VMware Workspace One
VMware Horizon Cloud Service
In addition to the above, I’m sure I’ll be expanding my knowledge on things I wasn’t even planning on… You could say the point of the conference is to “Explore”!
VMware products and solutions have been an important part of the solutions and offerings my business provides. In addition, those products and solutions are also the foundations of many businesses’ and organizations’ key IT infrastructure.
These conferences are great to network, discuss business, find new potential clients and vendors, and also connect with those that you already do business with!
In the last 4 years the amount of international consulting I’ve been providing has increased exponentially on a year over year basis. And while it’s been an amazing experience and I’ve had the chance to help many organizations with their VMware infrastructure, the only complaint I have is that I can’t meet face-to-face and shake hands with those customers as much as I’d like to. We have Zoom and Teams, but it’s not the same thing…
One thing I’m really looking forward to, is finally meeting quite a few of those customers face-to-face for the first time. I’m sure we’ll even have a few stories after attending a few (or many) of the VMware Explore parties that happen during that week.
Additionally, many major vendors sponsor VMware Explore and will have booths at the event, so I’m looking forward to meeting and shaking hands with some of my favorite vendors!
All in all, I think it’s going to be a great time and I’m really excited to attend. I hope to see you there!
I purchased the new Lenovo X13s Windows on ARM laptop, and wanted to share my first impressions with the device. I plan on creating a full review in a later post, however I wanted to provide some insight on my initial first impressions, as these can be a game changer or deal breaker for most people considering purchasing this laptop.
I’m going to break this blog post up in to a few key sections that were the most important, and most noticeable when first getting my hands on this device.
I’ll be limiting this post to the first impressions as much as possible saving the rest for the full review.
Pre-purchase expectations and initial thoughts
With lots of travel approaching, and with an aging laptop (Lenovo X1 Carbon Gen-2013), I needed to purchase a new laptop that I could use that would fit my requirements:
WWAN (Preferably 5G)
Good Battery Life
VDI – VMware Horizon Client
IT Applications (Putty, WinSCP, RDP)
You can see that my usage is similar to the business road warrior professional, with an IT add-on. I’m almost always connected to a VDI session, and also spend 50-100% of the day on Zoom or Microsoft Teams meetings.
With full knowledge about ARM architecture, and the new laptops and devices that have been released, I decided to take a big risk and try one of the new Windows on ARM laptops, specifically the Lenovo X13s.
ARM laptops generally provide great performance, really good battery life, and an “always on” ready to go environment.
I’ll be saving the tech spec deep dive for the full review, however I wanted to provide some basic information on the specifications of the model I purchased.
Part Number: 21BX0008US
CPU: Snapdragon® 8cx Gen 3 Compute Platform (3.00 GHz up to 3.00 GHz)
I specifically wanted a large SSD, lots of RAM, and definitely the 5G WWAN modem built in. I purchased the highest configured model without going custom (to take advantage of special pricing and promotions).
Receiving the laptop, the first things that really stick out are the size, texture (quality of materials), thinness, and no fan ports. It’s a very beautifully designed laptop.
While it is smaller than I expected, it does not feel cheap. The materials used with this laptop give it the same quality and feel as the X1 Carbon.
For whatever reason, I was expecting something the same size as my original X1 Carbon, however the X13s is thinner and has a slightly smaller width and height in comparison.
Originally I thought this was going to be a problem, but after using the laptop, I’m absolutely in love with the size of this. As far as portability and usability, based on first impressions, this thing has both!
Surprisingly, because of the smaller size of the laptop, I’ve actually found it very easy to type quickly. I’ve noticed that of all the laptops I’ve owned, as well as desktop keyboards, I can type the fastest on the X13s, because of the size of the keyboard as well as the layout and feel.
Keystrokes feel and sound amazing, with a perfectly built keyboard. I honestly have no complaints…
The display is absolutely beautiful. Although I believe there is an option for a 400-nit display, my model has the 300-nit panel because I wanted the touchscreen.
As for visibility, in my apartment with all the windows open on a sunny day, I can see everything crisply on this display.
The only thing I noticed is that when viewing black/gray scale content (most of my UI and apps are in dark mode), it looks like the backlight dims and sometimes text becomes faded. You can still see everything fine, however this causes an odd effect when the screen content changes to something with white or color.
To fix this, uncheck “Help improve battery by optimizing the content shown and brightness” in Settings.
After unchecking this option, everything is perfect!
The battery on this unit is absolutely blowing my mind. In 4 days of usage, it has barely used any battery; I’ve never used a laptop that can hold up like this.
Comparing this to my old X1 in 4 days of usage, I probably would have had to charge it 3-4 times. The X13s just keeps going and going and going.
Very impressed with this, as it’s going to help with travel and staying connected on the go.
Speakers and Sound
The sound is fantastic, and playing music sounds great. The laptop includes a sound system enhanced with Dolby.
I’m not much of an audiophile, but I have to say I was impressed with the volume and quality of audio that comes from the laptop.
This laptop has no fans or air ducts. One would think this would make for a laptop that runs hot, but I have to say I haven’t really noticed any hot temperatures except for when I first booted it up and did Windows Updates, Lenovo Updates, Microsoft Office installer, and a bunch of other things.
Even under extremely heavy load during the installs, the heat generated was actually less than what I would have expected, or experienced with my old Lenovo X1 Carbon.
Windows 11 for ARM64 (Windows on ARM)
For the most part, if you didn’t understand what Windows on ARM was, processor architectures, or the difference between this laptop and others, you’d notice absolutely nothing different from a normal laptop (except maybe if you were gaming).
I have to say that Microsoft knocked it out of the park with the development of Windows 11 on ARM, and it’s definitely 100% ready for primetime use, both for regular users as well as enterprise/business users.
The one thing I can’t comment on is gaming. While I haven’t done any testing (as I don’t game much), there may be additional considerations as far as stability and performance, or even capabilities of gaming.
When it comes to applications, while the X13s does support x86 and x64 emulation, you should always try to run native ARM/ARM64 applications. Running applications native to the architecture will provide the best performance as well as battery life.
After getting going, I noticed the following applications had native ARM64 support:
Edge (built off Chromium)
I also loaded numerous applications that are x86/x64 and emulated:
VMware Horizon Client
All the above applications, both ARM and x86/x64, run fantastically without any problems. I was concerned that the whole emulation layer would be a mess, but I’ve seriously had no problems.
I can’t say enough how snappy Windows 11 on ARM and the X13s are. I never thought I’d say it, but this is the fastest performing Windows 11 system I’ve used when it comes to responsiveness of the OS and applications.
The built-in 5G connectivity was super easy to setup. The laptop can use an eSIM or traditional physical SIM. I had the experience of using both at different points (because of issues with my cell phone provider).
The eSIM was super easy to setup and you can manage multiple different profiles. I simply purchased an eSIM, and scanned the QR code with the webcam.
When I had to switch to the physical SIM (because my provider doesn’t support 5G with eSIMs), I simply popped the SIM tray and installed the card.
It’s very easy to not only switch between eSIM profiles, but also switch between the eSIM and normal SIM. This is great if you’re travelling to other countries as you can easily switch between your local provider’s eSIM, and install a foreign SIM to use local data.
Your speed will vary depending on provider, but I was able to achieve the full speed that was expected from my provider, and I was pleasantly surprised with better than expected low latencies, which is great for VDI which I use regularly.
Because of the ARM processor, Windows is “always on”. There’s no resume from suspend time, just like your ARM based cell/mobile phone.
The laptop is virtually always on and ready to go when I need to work.
Overall First Impressions
Overall, my first impressions with this laptop have been fantastic and this laptop is exceeding my best expectations. Windows 11 on ARM is definitely a serious contender when it comes to choosing the right laptop/notebook.
The OS is snappy, everything works the way you’d expect on Windows, and so far I’m very happy with the investment I made when purchasing this laptop. I can’t wait to do some travelling with this to start using it to its full potential.
Add in 5G always-on connectivity, and it feels like this thing is unstoppable…
It’s been coming for a while: The requirement to deploy VMs with a TPM module… Today I’ll be showing you the easiest and quickest way to create and deploy Virtual Machines with vTPM on VMware vSphere ESXi!
As most of you know, Windows 11 requires Secure Boot as well as a TPM module. It’s likely that we’ll also see this requirement in future Microsoft Windows Server operating systems.
While users struggle to deploy TPM modules on their own workstations to be eligible for the Windows 11 upgrade, ESXi administrators are also struggling with deploying Virtual TPM modules, or vTPM modules on their virtualized infrastructure.
What is a TPM Module?
TPM stands for Trusted Platform Module. A Trusted Platform Module is a piece of hardware (or chip) inside or outside of your computer that provides secured computing features to the computer, system, or server that it’s attached to.
A TPM module provides things like a random number generator, storage of encryption keys and cryptographic information, as well as aiding in secure authentication of the host system.
In a virtualization environment, we need to emulate this physical device with a Virtual TPM module, or vTPM.
What is a Virtual TPM (vTPM) Module?
A vTPM module is a virtualized software instance of a traditional physical TPM module. A vTPM can be attached to Virtual Machines and provide the same features and functionality that a physical TPM module would provide to a physical system.
vTPM modules can be deployed with VMware vSphere ESXi, and can be used to deploy Windows 11 on ESXi.
Deploying vTPM modules requires a Key Provider on the vCenter Server.
Deploying vTPM (Virtual TPM Modules) on VMware vSphere ESXi
In order to deploy vTPM modules (and VM encryption, vSAN Encryption) on VMware vSphere ESXi, you need to configure a Key Provider on your vCenter Server.
Traditionally, this would be accomplished with a Standard Key Provider utilizing a Key Management Server (KMS), however this required a 3rd party KMS server and is what I would consider a complex deployment.
VMware has made this easy as of vSphere 7 Update 2 (7U2), with the Native Key Provider (NKP) on the vCenter Server.
The Native Key Provider allows you to easily deploy technologies such as vTPM modules, VM encryption, and vSAN encryption, and the best part is, it’s all built into vCenter Server.
Enabling VMware Native Key Provider (NKP)
To enable NKP across your vSphere infrastructure:
Log on to your vCenter Server
Select your vCenter Server from the Inventory List
Select “Key Providers”
Click on “Add”, and select “Add Native Key Provider”
Give the new NKP a friendly name
De-select “Use key provider only with TPM protected ESXi hosts” to allow your ESXi hosts without a TPM to be able to use the native key provider.
In order to activate your new native key provider, you need to click on “Backup” to make sure you have it backed up. Keep this backup in a safe place. After the backup is complete, your NKP will be active and usable by your ESXi hosts.
There are a few additional things to note:
Your ESXi hosts do NOT require a physical TPM module in order to use the Native Key Provider
Just make sure you disable the checkbox “Use key provider only with TPM protected ESXi hosts”
NKP can be used to enable vTPM modules on all editions of vSphere
If your ESXi hosts have a TPM module, using the Native Key Provider with your hosts’ TPM modules can provide enhanced security
Onboard TPM module allows keys to be stored and used if the vCenter server goes offline
If you delete the Native Key Provider, you are also deleting all the keys stored with it.
Make sure you have it backed up
Make sure you don’t have any hosts/VMs using the NKP before deleting
You can now deploy vTPM modules to virtual machines in your VMware environment.
We all know that vMotion is awesome, but what is even more awesome? Optimizing VMware vMotion to make it redundant and faster!
vMotion allows us to migrate live Virtual Machines from one ESXi host to another without any downtime. This allows us to perform physical maintenance on the ESXi hosts, update and restart the hosts, and also load balance VMs across the hosts. We can even take this a step further and use DRS (Distributed Resource Scheduler) automation to intelligently load the hosts on VM boot and to dynamically load balance the VMs as they run.
In this post, I’m hoping to provide information on how to fully optimize and use vMotion to its full potential.
Most of you are probably running vMotion in your environment, whether it’s a homelab, dev environment, or production environment.
I typically see vMotion deployed on the existing data network in smaller environments, I see it deployed on its own network in larger environments, and in very highly configured environments I see it being used with the vMotion TCP stack.
While you can perform a vMotion with 1Gb networking, you almost always want at least 10Gb networking for the vMotion network to avoid long-running migrations. Most IT admins are happy with live migrations that complete in seconds, not minutes.
VMware vMotion Optimization
So you might ask, if vMotion is working and you’re satisfied, what is there to optimize? There’s actually a few things, but first let’s talk about what we can improve on.
We’re aiming for improvements with:
Migrate more VMs
Evacuate hosts faster
Enable more aggressive DRS
Migrate many VMs at once very quickly
Redundant vMotion Interfaces (NICs and Uplinks)
More Complex vMotion Configurations
vMotion over different subnets and VLANs
vMotion routed over Layer 3 networks
To achieve the above, we can focus on the following optimizations:
Enable Jumbo Frames
Saturation of NIC/Uplink for vMotion
Use of the vMotion TCP Stack
Let’s get to it!
Enable Jumbo Frames
I can’t stress enough how important it is to use Jumbo Frames for specialized network traffic on high speed network links. I highly recommend you enable Jumbo Frames on your vMotion network.
Note that you’ll need to have a physical switch and NICs that support Jumbo Frames.
In my own high throughput testing on a 10Gb link, without using Jumbo frames I was only able to achieve transfer speeds of ~6.7Gbps, whereas enabling Jumbo Frames allowed me to achieve speeds of ~9.8Gbps.
When enabling this inside of vSphere and/or ESXi, you’ll need to make sure you change and update the applicable vmk adapter, vSwitch/vDS switches, and port groups. Additionally as mentioned above you’ll need to enable it on your physical switches.
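To get a feel for why Jumbo Frames help, here is a rough back-of-the-envelope calculation. It assumes standard Ethernet framing overhead (header, FCS, preamble, and inter-frame gap) and plain IPv4 and TCP headers with no options; the function names are just for illustration, and real vMotion traffic will not match these idealized numbers exactly.

```python
ETHERNET_OVERHEAD = 14 + 4 + 8 + 12  # header + FCS + preamble + inter-frame gap
IP_TCP_HEADERS = 20 + 20             # IPv4 + TCP, assuming no options


def wire_efficiency(mtu):
    """Fraction of on-the-wire bytes that is actual TCP payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


def frames_per_second(link_bps, mtu):
    """Full-size frames per second needed to fill the link."""
    return link_bps / ((mtu + ETHERNET_OVERHEAD) * 8)


if __name__ == "__main__":
    link = 10_000_000_000  # 10Gb link, as in my testing above
    for mtu in (1500, 9000):
        print(
            f"MTU {mtu}: {wire_efficiency(mtu):.1%} payload efficiency, "
            f"{frames_per_second(link, mtu):,.0f} frames/sec at line rate"
        )
```

The payload efficiency only improves from roughly 95% to roughly 99%, so header overhead alone doesn’t account for the gap I measured. The bigger change is that the host and NIC have to process roughly six times fewer frames per second at an MTU of 9000, which is likely where most of the gain comes from.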
Saturation of NIC/Uplink for vMotion
You may assume that once you configure a vMotion enabled NIC, that when performing migrations you will be able to fully saturate it. This is not necessarily the case!
When performing a vMotion, the vmk adapter is bound to a single thread (or CPU core). Depending on the power of your processor and the speed of the NIC, you may not actually be able to fully saturate a single 10Gb uplink.
In my own testing in my homelab, I needed to have a total of 2 VMK adapters to saturate a single 10Gb link.
If you’re running 40Gb or even 100Gb, you definitely want to look at adding multiple VMK adapters to your vMotion network to be able to fully saturate a single NIC or Uplink.
You can do this by simply configuring multiple VMK adapters per host with different IP addresses on the same subnet.
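When planning this for several hosts, it helps to lay out the addressing ahead of time. The sketch below hands out consecutive addresses from one vMotion subnet; the subnet, host names, and vmk numbering are placeholders I made up for the example.

```python
import ipaddress
from itertools import islice


def plan_vmk_addresses(subnet, host_names, vmks_per_host, first_vmk=1):
    """Assign each host several vmk adapters with unique IP addresses,
    all taken from the same vMotion subnet."""
    network = ipaddress.ip_network(subnet)
    needed = len(host_names) * vmks_per_host
    addresses = list(islice(network.hosts(), needed))
    if len(addresses) < needed:
        raise ValueError("Subnet is too small for the requested adapters.")
    remaining = iter(addresses)
    plan = {}
    for host in host_names:
        plan[host] = {
            f"vmk{first_vmk + i}": str(next(remaining))
            for i in range(vmks_per_host)
        }
    return plan


if __name__ == "__main__":
    # Two vmk adapters per host, matching what I needed for a 10Gb link.
    plan = plan_vmk_addresses("10.0.50.0/24", ["esxi01", "esxi02"], 2)
    for host, adapters in plan.items():
        print(host, adapters)
```

Each adapter still has to be created on the host and enabled for vMotion; this only produces the address plan.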
One important thing to mention is that if you have multiple physical NICs and Uplinks connected to your vMotion switch, this change will not help you utilize multiple physical interfaces (NICs/Uplinks). See “Multi-NIC/Uplink vMotion”.
Please note: As of VMware vSphere 7 Update 2, the above is not required as vMotion has been optimized to use multiple streams to fully saturate the interface. See VMware’s blog post “Faster vMotion Makes Balancing Workloads Invisible” for more information.
Multi-NIC/Uplink vMotion
Another situation is where we may want to utilize multiple NICs and Uplinks for vMotion. When implemented correctly, this can provide load balancing (additional throughput) as well as redundancy on the vMotion network.
If you were to simply add additional NICs as Uplinks to your vMotion network, this would add redundancy in the event of a link failure, but it wouldn’t actually result in increased speed or throughput, as special configuration is required.
To take advantage of the additional bandwidth made available by additional Uplinks, we need to configure multiple portgroups on the switch (vSwitch or vDS), and configure each portgroup to use only one of the Uplinks as the “Active Uplink”, with the rest of the uplinks set as “Standby Uplinks”.
vSwitch or vDS Switch
Portgroup 1
Active Uplink: Uplink 1
Standby Uplinks: Uplink 2, Uplink 3, Uplink 4
Portgroup 2
Active Uplink: Uplink 2
Standby Uplinks: Uplink 1, Uplink 3, Uplink 4
Portgroup 3
Active Uplink: Uplink 3
Standby Uplinks: Uplink 1, Uplink 2, Uplink 4
Portgroup 4
Active Uplink: Uplink 4
Standby Uplinks: Uplink 1, Uplink 2, Uplink 3
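The pattern above (one portgroup per uplink, each with a different active uplink and all others on standby) can be generated for any number of uplinks. This is only a planning sketch in Python. The portgroup names such as "vMotion-PG1" are hypothetical, and the output is a plain dictionary to copy into your teaming and failover settings, not an API call.

```python
def portgroup_teaming(uplinks):
    """Build one portgroup per uplink: that uplink active, the rest standby."""
    layout = {}
    for index, active in enumerate(uplinks, start=1):
        standby = [uplink for uplink in uplinks if uplink != active]
        layout[f"vMotion-PG{index}"] = {"active": [active], "standby": standby}
    return layout


layout = portgroup_teaming(["Uplink 1", "Uplink 2", "Uplink 3", "Uplink 4"])
for name, teaming in layout.items():
    print(name, "| active:", teaming["active"], "| standby:", teaming["standby"])
```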
You would then place a single or multiple vmk adapters on each of the portgroups per host, which would result in essentially mapping the vmk(s) to the specific uplink. This will allow you to utilize multiple NICs for vMotion.
And remember, you may not be able to fully saturate a NIC interface (as stated above) with a single vmk adapter, so I highly recommend creating multiple vmk adapters on each of the Port groups above to make sure that you’re not only using multiple NICs, but that you can also fully saturate each of the NICs.
When performing a VMware vMotion on a Virtual Machine with an NVIDIA vGPU attached to it, the VM may freeze during migration. By contrast, when performing a vMotion on a VM without a vGPU, the VM does not freeze during migration.
So why is it that adding a vGPU to a VM causes it to become frozen during vMotion? This is referred to as the VM Stun Time.
I’m going to explain why this happens, and what you can do to reduce these STUN times.
First, let’s start with traditional vMotion without a vGPU attached.
vMotion allows us to live migrate a Virtual Machine instance from one ESXi host, to another, with (visibly) no downtime. You’ll notice that I put “visibly” in brackets…
When performing a vMotion, vSphere will migrate the VM’s memory from the source to destination host and create checkpoints. It will then continue to copy memory deltas, including changed blocks, after the initial copy.
Essentially vMotion copies the memory of the instance, then initiates more copies to copy over the changes after the original transfer was completed, until the point where it’s all copied and the instance is now running on the destination host.
VMware vMotion with vGPU
For some time, we have had the ability to perform a vMotion with a VM that has a vGPU attached to it.
However, in this situation things work slightly differently. When performing a vMotion, it’s not only the system RAM that needs to be transferred, but the GPU’s memory (VRAM) as well.
Unfortunately the checkpoint/delta transfer technology that’s used with the system RAM isn’t available to transfer the GPU’s memory, which means that the VM has to be stunned (frozen) so that the video RAM can be transferred and the instance can then be initialized on the destination host.
The STUN time is essentially the time it takes to transfer the video RAM (framebuffer) from one host to another.
However, it will always vary depending on a number of factors. These factors include:
vMotion Network Speed
vMotion Network Optimization
Multi-NIC vMotion to utilize multiple NICs
Multi-vmk vMotion to optimize and saturate single NICs
The number of VMs that are currently being migrated with vMotion
As you can see, there are a number of things that play into this. If you have a single 10Gb link for vMotion and you’re migrating many VMs with a vGPU, it’s obviously going to take longer than if you were just migrating a single VM with a vGPU.
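To get a feel for the numbers, here is a rough lower-bound estimate in Python. It is a sketch under loud assumptions: the whole framebuffer is transferred, 1GB is counted as 10^9 bytes, the vMotion bandwidth is shared evenly between concurrent migrations, and the "efficiency" value (how much of the link you can actually saturate) is a guess you supply yourself. Real STUN times include other overhead and will be higher.

```python
def estimate_stun_seconds(framebuffer_gb, link_gbps, efficiency=1.0,
                          concurrent_migrations=1):
    """Rough lower bound: time to move the framebuffer over the vMotion link.

    Assumes an even bandwidth split between concurrent migrations and
    ignores everything except the raw framebuffer transfer.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1 or concurrent_migrations < 1:
        raise ValueError("invalid link speed, efficiency, or migration count")
    usable_gbps = link_gbps * efficiency / concurrent_migrations
    return framebuffer_gb * 8 / usable_gbps


# A 16GB vGPU profile over 10Gb, alone and alongside three other migrations.
print(f"{estimate_stun_seconds(16, 10):.1f}s")
print(f"{estimate_stun_seconds(16, 10, concurrent_migrations=4):.1f}s")
# The same profile over a faster 25Gb link.
print(f"{estimate_stun_seconds(16, 25):.1f}s")
```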
Optimizing and Minimizing vGPU STUN Time
There are a number of things we can look at to minimize the vGPU STUN times. These include:
Upgrading networking throughput with faster NICs
Optimizing vMotion (Configure multiple vMotion VMK adapters to saturate a NIC)
Microsoft Azure Active Directory has two different methods for handling SSO (Single Sign On): SSO via a Primary Refresh Token (PRT), and Azure AD Seamless SSO. In this post, I’ll explain the differences, and when to use which one.
What does Azure AD SSO do?
Azure AD SSO allows your domain joined Windows workstations (and Windows Servers) to provide an integrated Single Sign On experience for users accessing Microsoft 365 and/or Office 365.
When Azure AD SSO is enabled and functioning, your users will not be prompted nor have to log on to Microsoft 365 or Office 365 applications or services (including web services) as all this will be handled transparently in the background with Azure AD SSO.
For VDI environments, especially non-persistent VDI (VMware Instant Clones), this is an important function so that users are not prompted to login every time they launch an Office 365 application.
Persistent VDI is not complex and doesn’t have any special considerations for Azure AD SSO, as it will function the same way as traditional workstations, however non-persistent VDI requires special planning.
Please Note: Organizations often attribute the Office 365 login prompts to activation issues, when in fact activation is functioning fine. Instead, Azure AD SSO is either not enabled, incorrectly configured, or not functioning, which is why users are prompted for login credentials every time they establish a new session with non-persistent VDI. After reading this guide, you should be able to resolve the issue of Office 365 login prompts on non-persistent VDI and Instant Clone VMs.
Azure AD SSO methods
There are two different ways to perform Azure AD SSO in an environment that is not using ADFS. These are:
Azure AD SSO via Primary Refresh Token
Azure AD Seamless SSO
Both accomplish the same task, but were created at different times, have different purposes, and are used for different scenarios. We’ll explore this below so that you can understand how each works.
Fun fact: You can have both Azure AD SSO via PRT and Azure AD Seamless SSO configured at the same time to service your Active Directory domain, devices, and users.
Azure SSO via Primary Refresh Token
When using Azure SSO via Primary Refresh Token, SSO requests are performed by Windows Workstations (or Windows Servers) that are Hybrid Azure AD Joined. When a device is Hybrid Azure AD Joined, it is joined to your on-premises Active Directory domain, as well as registered to your Azure Active Directory.
Azure SSO via Primary Refresh Token requires the Windows instance to be running Windows 10 (or later) or Windows Server 2016 (or later), and the Windows instance also has to be Hybrid Azure AD joined. If you meet these requirements, SSO with PRT will be performed transparently in the background.
If you require your non-persistent VDI VMs to be Hybrid Azure AD joined and require Azure AD SSO with PRT, special considerations and steps are required:
Scripts to automatically unjoin non-persistent (Instant Clone) VDI VMs from Azure AD on logoff.
Scripts to clean up old entries in Azure AD
If you properly deploy this, it should function. If you don’t require your non-persistent VDI VMs to be Hybrid Azure AD joined, then Azure AD Seamless SSO may be better for your environment.
Azure AD Seamless SSO
Once configured and implemented, Microsoft Azure AD Seamless SSO handles Azure AD SSO requests without requiring the device to be Hybrid Azure AD joined.
Seamless SSO works on Windows instances running Windows 7 (or later, including Windows 10 and Windows 11), and does NOT require the device to be Hybrid joined.
Seamless SSO allows your Windows instances to access Azure related services (such as Microsoft 365 and Office 365) and provides a single sign-on experience.
This may be the easier method to use when deploying non-persistent VDI (VMware Instant Clones), if you want to implement SSO with Azure, but do not have the requirement of Hybrid AD joining your devices.
Additionally, by using Seamless SSO, you do not need to implement the required log-off and maintenance scripts mentioned in the above section (for Azure AD SSO via PRT).
To use Azure AD Seamless SSO with non-persistent VDI, you must configure and implement Seamless SSO, as well as perform one of the following to make sure your devices do not attempt to Hybrid AD join:
Exclude the non-persistent VDI computer OU containers from Azure AD Connect synchronization to Azure AD
Implement a registry key on your non-persistent (Instant Clone) golden image, to disable Hybrid Azure AD joining.
To disable Hybrid Azure AD join on Windows, create the registry key on your Windows image below:
Different methods can be used to implement SSO with Active Directory and Azure AD as stated above. Use the method that will be the easiest to maintain and provide support for the applications and services you need to access. And remember, you can also implement and use both methods in your environment!
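The decision logic described in this post can be summarized in a small Python sketch. It only encodes the rules of thumb above (the class and function names are my own, purely for illustration), and it is no substitute for checking the requirements against your own environment.

```python
from dataclasses import dataclass


@dataclass
class WindowsInstance:
    windows_10_or_server_2016_or_later: bool
    hybrid_azure_ad_joined: bool
    non_persistent: bool = False


def sso_method(instance: WindowsInstance) -> str:
    """PRT needs a recent OS and a Hybrid join; otherwise use Seamless SSO."""
    if (instance.windows_10_or_server_2016_or_later
            and instance.hybrid_azure_ad_joined):
        return "SSO via Primary Refresh Token"
    return "Seamless SSO"


def extra_steps(instance: WindowsInstance) -> list:
    """Extra work this post calls for on non-persistent VDI."""
    if not instance.non_persistent:
        return []
    if sso_method(instance) == "SSO via Primary Refresh Token":
        return ["unjoin from Azure AD on logoff",
                "clean up old Azure AD device entries"]
    return ["prevent Hybrid Azure AD join (sync exclusion or registry key)"]


instant_clone = WindowsInstance(True, False, non_persistent=True)
print(sso_method(instant_clone), extra_steps(instant_clone))
```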
After configuring Azure AD SSO, you’ll still be required to implement the relevant GPOs to configure Microsoft 365 and Office 365 behavior in your environment.
Please see below for additional information and resources: