Jan 112024
 
ESXi-Host-Decommission

While most of us frequently deploy new ESXi hosts, a question and task not oftenly discussed is how to properly decommission a VMware ESXi host.

Some might be surprised to learn that you cannot simply just power down and remove the host from the vCenter server, as there are a number of steps that must be taken beforehand to ensure a proper successful decommission. Properly decommissioning the ESXi host avoids orphaned objects in the vCenter database, which can sometimes cause problems in the future.

Today we’ll go over how to properly decommission a VMware ESXi host in an environment with VMware vCenter Server.

The Process – How to decommission ESXi

We will detail the process and considerations to decommission an ESXi host. We will assume that you have since migrated all your VMs, templates, and files from the host, and it contains no data that requires backup or migration.

VMware ESXi Host Decommission Procedures
VMware ESXi Host Decommission Procedures

Process in Short:

  1. Enter Maintenance Mode
  2. Remove Host from vDS Switches
  3. Unmount and Detach iSCSI LUNs
  4. Move host from cluster to datacenter as standalone host
  5. Remove Host from Inventory

Please read further for extended procedures and more information.

Enter Maintenance Mode

We enter maintenance mode to confirm that no VMs are running on the host. You can simply right click the host, and enter maintenance mode.

Remove Host from vDS Switches

You must gracefully remove the host from any vDS switches (VMware Distributed Switches) before removing the host from vCenter Server.

You can create a standard vSwitch and migrate vmk (VMware Kernel) adapters from the vDS switch to standard vSwitch, to maintain communication with the vCenter server and other networks.

Please Note: If you are using vDS switches for iSCSI connectivity, you must gracefully develop a plan to deal with this beforehand, either by unmounting/detaching the iSCSI LUNs on the vDS before removing the switch, or gracefully migrating the vmk adapters to a standard vSwitch, using MPIO to avoid losing connectivity during the process.

Unmount and Detach iSCSI LUNs

You can now proceed to unmount and detach iSCSI LUNs from the selected system:

  1. Unmount the iSCSI LUN(s) from the host
  2. Detach the iSCSI LUN(s) from the host

You will unmount only on the selected host to be decommissioned, and then detach the LUNs (again only on the host you are decommissioning).

Move Host from Cluster to Datacenter as standalone host

While this may not be required, I usually do this to let vSphere Cluster Services (HA/DRS) adjust for the host removal, and also deal with reconfiguration of the HA agent on the ESXi Host. You can simply move the host from the cluster to the parent datacenter level.

Remove Host from Inventory

Once the host has been moved and a moment or two have elapsed, you can now proceed to remove the host from inventory.

While the host is powered on and still connected to vCenter, right click on the host and choose “Remove from Inventory”. This will gracefully remove objects from vCenter, and also uninstall the HA agent from the ESXi host.

Host Repurposing

At this point, you can now log directly on to the ESXi host using the local root password, and shutdown the host.

Jan 072024
 
VMware Horizon View Logo

This guide will outline the instructions to Disable the VMware Horizon Session Bar. These instructions can be used to disable the Horizon Session Bar (also known as the Horizon Client Menu Bar or Shade Bar) for full screen Horizon VDI sessions.

Horizon Client Menu Bar (Shade)

The Horizon Client Menu Bar, or “Shade”, is the Session bar at the top of full screen VMware Horizon VDI Sessions.

This Menu Bar provides information on the connection, ability to send key sequences, connect USB devices, restart a VDI guest VM and more.

In same cases, users or administrators may want to disable the Shade.

Disable the Horizon Client Menu Bar (Shade)

There are multiple ways that you can disable the shade including using GPOs as well as the registry on client systems. Please note that if you are setting up clients in Kiosk mode, the shade will be automatically disabled and these instructions aren’t required.

Disable Horizon Shade using GPO

To disable the Shade with GPOs, create a Group Policy Object (or edit the local group policy on the client system), and navigate to the following location:

User Configuration -> Policies -> Administrative Templates -> VMware Horizon Client Configuration

Here, we will set Enable the shade to “Disabled”, as show below:

Disable VMware Horizon Shade using GPO to set “Enable the Shade” to Disabled

Disable Horizon Shade using Registry

To disable the Shade using registry on the client system, navigate to the following registry key:

HKEY_LOCAL_MACHINE\Software\VMware, Inc.\VMware VDM\Client\

Here, we can create a String (REG_SZ) value called EnableShade and set it to False which will disable the Shade.

Additional Information

Jan 062024
 
vMotion with vGPU

Normally, any VMs that are NVIDIA vGPU enabled have to be manually migrated with manual vMotion if a host is placed in to maintenance mode, to evacuate the host. While we may have grown accustomed to this, there is a better way, with vGPU Enabled VM DRS Evacuation during Maintenance mode!

A new feature that was introduced with vSphere 7.0 U3f, was the ability to configure and allow automatic vMotion of VMs with vGPUs, meaning that DRS can now migrate your VDI and AI/ML vGPU enabled workloads when hosts are placed in to maintenance mode. This also allows you to streamline remediation with vLCM when updating vGPU enabled hosts running vGPU enabled VMs.

Additionally, as of vSphere 8.0 U2, DRS can now estimate the STUN times required for vMotion of vGPU enabled VMs, and control whether automatic DRS vMotion’s are allowed. This STUN time limit can be set buy an administrator.

Enable automatic vMotion evacuation of vGPU enabled VMs

To enable the automatic vMotion of vGPU enabled VMs on your vSphere Cluster:

  1. Navigate to your vSphere Cluster.vSphere Cluster Selected
  2. Click on the “Configure” Tab, and then select “vSphere DRS”, and click “Edit”.vSphere DRS Cluster - DRS Advanced Settings
  3. Navigate to the “Advanced Options” tab.
  4. Add “VgpuMMAutomationTimeoutSecs” and set to “-1”.vSphere DRS set VgpuMMAutomationTimeoutSecs

After performing the above, when you place a host with vGPU enabled Virtual Machines in to Maintenance Mode, vSphere DRS will evacuate and migrate the VMs to other hosts in the cluster that have the required hardware.

If you attempt to place a host in to Maintenance Mode without enabling automatic vMotion of vGPU enabled VMs, it will fail with the error: “DRS failed to generate a vMotion recommendation for a virtual machine on a host entering Maintenance Mode“.

Enable and Configure vGPU STUN Time Estimate and Limits

If you are running vSphere 8U2 or higher, you can enable vGPU STUN time estimation and limits for DRS on the vGPU enabled cluster. Similar to the instructions above, we can add and configure two variables to the vSphere DRS cluster “Advanced Options”.

To enable STUN time estimation, add PassthroughDrsAutomation and set to “1”.

To override the default vMotion STUN time limit of 100 seconds, add VmDevicesStunTimeTolerated and set it to your preferred maximum number of seconds. Alternatively, you can set this limit Per VM by navigating to the VM in vSphere and adding this variable under the “VM Options” “Advanced Settings” section.

Additional Documentation

Jan 052024
 
NVIDIA vGPU Installed in VMware ESXi Host

You may experience GPU issues with the VMware Horizon Indirect Display Driver in your environment when using 3rd party applications which incorrectly utilize the incorrect display adapter. This results with the inability to use and/or run GPU accelerated workloads including VDI, AI, and ML.

This issue effects NVIDIA vGPU (both vGPU and vDGA passthrough), AMD MxGPU, and Intel Data Center GPU Flex GPUs using SR-IOV, in any deployment where the VMware Indirect Display Driver is installed.

When this issue occurs, the application will incorrectly query the capabilities of the VMware Indirect Display Adapter instead of the GPU that is presented to the VM, resulting in a scenario where the application isn’t aware of the capabilities of the GPU you are utilizing, failing to utilize the GPU, and hardware acceleration, such as hardware encoding (NVENC) and hardware decoding.

What is the VMware Horizon Indirect Display Driver

The VMware Horizon Indirect Display Driver, also known as the VMware Indirect Display Driver, is a “virtual” display driver that isn’t bound to a specific hypervisor, and works with many deployments because of the lack of that limitation.

GPU Issues with the VMware Horizon Indirect Display Driver Enabled

This driver is installed with the VMware Horizon agent, and can work in conjunction with hardware acceleration, including GPUs (such as NVIDIA vGPU, AMD MxGPU, and Intel Data Center GPUs using SR-IOV).

Under normal circumstances, the VMware Horizon Indirect Display Driver is prioritized as a fallback driver for remoting protocols, except in environments where no hypervisor or GPU display drivers are available (like Horizon Cloud on Azure) in which case it would become the priority.

The Problem

Applications designed to use a GPU, may not be able to correctly identify which display adapter to use on the VM. While you may have a GPU, vGPU, or 3D acceleration in your environment, the application may be unaware of the device and/or its capabilities.

This is caused by the application either not correctly using the preferred primary display adapter (GPU and/or vGPU), or not being designed to handle multiple display adapters (and drivers).

Example Scenario:

When using CyberLink PowerDirector 360 in a VMware Horizon environment with an NVIDIA vGPU, the application will query the VM’s Windows instance for hardware acceleration capabilities, specifically hardware encoding, hardware decoding, and use of APIs like NVIDIA’s NVENC encoder. In this scenario, while the VM does have an NVIDIA vGPU workstation profile attached with a valid NVIDIA RTX Virtual Workstation (vWS) license, the application is only aware of the VMware Indirect Display Driver and it’s capabilities. This results in all hardware accelerated encoding and decoding capabilities to be disabled.

Example Symptoms

  • 3D Acceleration not detected by application
  • CUDA Cores not available for application
  • OpenCL not available
  • DirectX and Direct3D usage unavailable

In all scenarios, the VM will appear to have 3D acceleration, however one or multiple applications won’t have access.

The Solution

Thanks to the design of the VMware Indirect Display Driver, it should be prioritized in a fashion that it’s used only when other display drivers aren’t available (including NVIDIA vGPU), or system resources aren’t available; however, some 3rd party application may not be able to reference the prioritization, or support multi-GPU (multi display driver), resulting in the incorrect display adapter being used.

As a workaround, you can remove the VMware Indirect Display Driver from the Windows instance running in the VM.

NVIDIA vGPU with VMware Horizon Indirect Display Driver Removed

Please note that simply disabling the “VMware Horizon Indirect Display Driver” will not suffice. A full removal (Right Click, “Uninstall Device”) is required to workaround this issue. Additionally, upgrading or re-installing the VMware Horizon Agent will re-install the VMware Indirect Display Driver.

Dec 082023
 
vCenter-Root-CA-Missing

Today we’ll go over how to install the vSphere vCenter Root Certificate on your client system.

Certificates are designed to verify the identity of the systems, software, and/or resources we are accessing. If we aren’t able to verify and authenticate what we are accessing, how do we know that the resource we are sending information to, is really who they are?

Installing the vSphere vCenter Root Certificate on your client system, allows you to verify the identity of your VMware vCenter server, VMware ESXi hosts, and other resources, all while getting rid of those pesky certificate errors.

Certificate warning when connecting to vCenter vCSA
Certificate warning when connecting to vCenter vCSA

I see too many VMware vSphere administrators simply dismiss the certificate warnings, when instead they (and you) should be installing the Root CA on your system.

Why install the vCenter Server Root CA

Installing the vCenter Server’s Root CA, allows your computer to trust, verify, and validate any certificates issued by the vSphere Root Certification authority running on your vCenter appliance (vCSA). Essentially this translates to the following:

  • Your system will trust the Root CA and all certificates issued by the Root CA
    • This includes: VMware vCenter, vCSA VAMI, and ESXi hosts
  • When connecting to your vCenter server or ESXi hosts, you will not be presented with certificate issues
  • You will no longer have vCenter OVF Import and Datastore File Access Issues
    • This includes errors when deploying OVF templates
    • This includes errors when uploading files directly to a datastore
File Upload in vCenter to ESXi host operation failed

In addition to all of the above, you will start to take advantage of certificate based validation. Your system will verify and validate that when you connect to your vCenter or ESXi hosts, that you are indeed actually connecting to the intended system. When things are working, you won’t be prompted with a notification of certificate errors, whereas if something is wrong, you will be notifying of a possible security event.

How to install the vCenter Root CA

To install the vCenter Root CA on your system, perform the following:

  1. Navigate to your VMware vCenter “Getting Started” page.
    • This is the IP or FQDN of your vCenter server without the “ui” after the address. We only want to access the base domain.
    • Do not click on “Launch vSphere Client”.
  2. Right click on “Download trusted root CA certificates”, and click on save link as.
    Link to download vCenter trusted root CA Certificates
  3. Save this ZIP file to your computer, and extract the archive file
    • You must extract the ZIP file, do not open it by double-clicking on the ZIP file.
  4. Open and navigate through the extracted folders (certs/win in my case) and locate the certificates.
    VMware vCenter Root Certificates
  5. For each file that has the type of “Security Certificate”, right click on it and choose “Install Certificate”.
  6. Change “Store Location” to “Local Machine”
    • This makes your system trust the certificate, not just your user profile
  7. Choose “Place all certificates in the following store”, click Browse, and select “Trusted Root Certification Authorities”.
    Screenshot to Place in Trusted Root Certification Authorities
  8. Complete the wizard. If successful, you’ll see: “The import was successful.”.
  9. Repeat this for each file in that folder with the type of “Security Certificate”.

Alternatively, you can use a GPO with Active Directory or other workstation management techniques to deploy the Root CAs to multiple systems or all the systems in your domain.

Oct 072023
 
Installing VDI optimized New Teams client application on Windows VDI

In this guide we will deploy and install the new Microsoft Teams for VDI (Virtual Desktop Infrastructure) client, and enable Microsoft Teams Media Optimization on VMware Horizon.

This guide replaces and supersedes my old guide “Microsoft (Classic) Teams VDI Optimization for VMware Horizon” which covered the old Classic Teams client and VDI optimizations. The new Microsoft Teams app requires the same special considerations on VDI, and requires special installation instructions to function VMware Horizon and other VDI environments.

You can run the old and new Teams applications side by side in your environment as you transition users.

New Teams client with toggle for old version running on VMware Horizon VDI with optimization
Switch between New Teams and old Teams on VDI

Let’s cover what the new Microsoft Teams app is about, and how to install it in your VDI deployment.

Please note: VDI (Virtual Desktop Infrastructure) support for the new Teams client went G.A. (Generally Availabile) on December 05, 2023. Additionally, Classic teams will go end of support on June 30, 2024.

Table of Contents

Please see below for a table of contents:

The New Microsoft Teams App

On October 05, 2023, Microsoft announced the availability of the new Microsoft Teams application for Windows and Mac computers. This application is a complete rebuild from the old client, and provides numerous enhancements with performance, resource utilization, and memory management.

New Microsoft Teams app VDI optimized with Toggle for new/old version

Ultimately, it’s way faster, and consumes way less memory. And fortunately for us, it supports media optimizations for VDI environments.

My close friend and colleague, mobile jon, did a fantastic in-depth Deep Dive into the New Microsoft Teams and it’s inner workings that I highly recommend reading.

Interestingly enough, it uses the same media optimization channels for VDI as the old client used, so enablement and/or migrating from the old version is very simple if you’re running VMware Horizon, Citrix, AVD, and/or Windows 365.

Install New Microsoft Teams for VDI

While installing the new Teams is fairly simple for non-VDI environment (by simply either enabling the new version in the Teams Admin portal, or using your application manager to deploy the installer), a special method is required to deploy on your VDI images, whether persistent or non-persistent.

Do not include and bundle the Microsoft Teams install with your Microsoft 365 (Office 365) deployment as these need to be installed separately.

Please Note: If you have deployed non-persistent VDI (Instant Clones), you’ll want to make sure you disable auto-updates, as these should be performed manually on the base image. For persistent VDI, you will want auto updates enabled. See below for more information on configurating auto-updates.

You will also need to enable Microsoft Teams Media Optimization for the VDI platform you are using (in my case and example, VMware Horizon).

Considerations for New Teams on VDI

  • Auto-updates can be disabled via a registry key
  • New Teams client app uses the same VDI media optimization channels as the old teams (for VMware Horizon, Citrix, AVD, and W365)
    • If you have already enabled Media Optimization for Teams on VDI for the old version, you can simply install the client using the special bulk installer for all users as shown below, as the new client uses the existing media optimizations.
  • While it is recommended to uninstall the old client and install the new client, you can choose to run both versions side by side together, providing an option to your users as to which version they would like to use.

Enable Media Optimization for Microsoft Teams on VDI

If you haven’t previously for the old client, you’ll need to enable the Teams Media Optimizations for VDI for your VDI platform.

For VMware Horizon, we’ll create a GPO and set the “Enable HTML5 Features” and “Enable Media Optimization for Microsoft Teams”, to “Enabled”. If you have done this for the old Teams app, you can skip this.

Please see below for the GPO setting locations:

Computer Configuration -> Policies -> Administrative Templates -> VMware View Agent Configuration -> VMware HTML5 Features -> Enable VMware HTML5 Features
Computer Configuration -> Policies -> Administrative Templates -> VMware View Agent Configuration -> VMware HTML5 Features -> VMware WebRTC Redirection Features -> Enable Media Optimization for Microsoft Teams

When installing the VMware Horizon client on Windows computers, you’ll need to make sure you check and enable the “Media Optimization for Microsoft Teams” option on the installer if prompted. Your install may automatically include Teams Optimization and not prompt.

Screenshot of VMware View Client Install with Microsoft Teams Optimization
VMware Horizon Client Install with Media Optimization for Microsoft Teams

If you are using a thin client or zero client, you’ll need to make sure you have the required firmware version installed, and any applicable vendor plugins installed and/or configurables enabled.

Install New Microsoft Teams client on VDI

At this time, we will now install the new Teams app on to both non-persistent images, and persistent VDI VM guests. This method performs a live download and provisions as Administrator. If running this un-elevated, an elevation prompt will appear:

  1. Download the new Microsoft Teams Bootstrapper: https://go.microsoft.com/fwlink/?linkid=2243204&clcid=0x409
  2. On your persistent or non-persistent VM, run the following command as an administrator: teamsbootstrapper.exe -p
  3. Restart the VM (and/or seal your image for deployment)
Installing
Install the new Teams for VDI (Virtual Desktop Infrastructure) with teamsbootstrapper.exe

See below for an example of the deployment:

C:\Users\Administrator.DOMAIN\Downloads>teamsbootstrapper.exe -p
{
  "success": true
}

You’ll note that running the command returns success equals true, and Teams is now installed for all users on this machine.

Install New Microsoft Teams client on VDI (Offline Installer using MSIX package)

Additionally, you can perform an offline installation by also downloading the MSI-X packages and running the following command:

teamsbootstrapper.exe -p -o "C:\LOCATION\MSTeams-x64.msix"
New Teams admin provisioned offline install for VDI
New Teams admin provisioned offline install for VDI

For the offline installation, you’ll need to download the appropriate MSI-X file in additional to the bootstrapper above. See below for download links:

Disable New Microsoft Teams Client Auto Updates

For non-persistent environments, you’ll want to disable the auto update feature and install updates manually on your base image.

To disable auto-updates for the new Teams client, configure the registry key below on your base image:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Teams

Create a DWORD value called “disableAutoUpdate”, and set to value of “1”.

New Teams app disappears after Optimization with OSOT

If you are using the VMware Operating System Optimization Tool (OSOT), you may notice that after installing New Teams in your base or golden image, that it disappears when publishing and pushing the image to your desktop pool.

The New Teams application is a Windows Store app, and organizations commonly choose to remove all Windows Store apps inside the golden image using the OSOT tool when optimizing the image. Doing this will remove New Teams from your image.

To workaround this issue, you’ll need to choose “Keep all Windows Store Applications” in the OSOT common options, which won’t remove Teams.

Using New Microsoft Teams with FSLogix Profile Containers

When using the new Teams client with FSLogix Profile Containers on non-persistent VDI, you must upgrade to FSLogix version 2.9.8716.30241 to support the new teams client.

Confirm New Microsoft Teams VDI Optimization is working

To confirm that VDI Optimization is working on New Teams, open New Teams, click the “…” in the top right next to your user icon, click “Settings”, then click on “About Teams” on the far bottom of the Settings menu.

New Teams showing “VMware Media Optimized”

You’ll notice “VMware Media Optimized” which indicates VDI Optimization for VMware Horizon is functioning. The text will reflect for other platforms as well.

Uninstall New Microsoft Teams on VDI

The Teams Boot Strap utility can also remove teams for all users on this machine as well by using the “-x” flag. Please see below for all the options for “teamsbootstrapper.exe”:

C:\Users\Administrator.DOMAIN\Downloads>teamsbootstrapper.exe --help
Provisioning program for Microsoft Teams.

Usage: teamsbootstrapper.exe [OPTIONS]

Options:
  -p, --provision-admin    Provision Teams for all users on this machine.
  -x, --deprovision-admin  Remove Teams for all users on this machine.
  -h, --help               Print help

Install New Microsoft Teams on VMware App Volumes / Citrix App Layering

Using the New Teams bootstrapper, it appears that it evades and doesn’t work with App Packaging and App attaching technologies such as VMware App Volumes and Citrix Application layering.

The New Teams bootstrapper downloads and installs an MSIX app package to the computer running the bootstrapper.

To deploy and install new Teams on VMware App Volumes or Citrix App Layering (or other app technologies), you’ll most likely need to download and import the MSIX package in to the application manager and deploy using that.

Conclusion

It’s great news that we finally have a better performing Microsoft Teams client that supports VDI optimizations. With new Teams support for VDI reaching GA, and with the extensive testing I’ve performed in my own environment, I’d highly recommend switching over at your convenience!

Jul 282023
 
NVIDIA GPU Manager

In May of 2023, NVIDIA released the NVIDIA GPU Manager for VMware vCenter. This appliance allows you to manage your NVIDIA vGPU Drivers for your VMware vSphere environment.

Since the release, I’ve had a chance to deploy it, test it, and use it, and want to share my findings.

In this post, I’ll cover the following (click to skip ahead):

  1. What is the NVIDIA GPU Manager for VMware vCenter
  2. How to deploy and configure the NVIDIA GPU Manager for VMware vCenter
    • Deployment of OVA
    • Configuration of Appliance
  3. Using the NVIDIA GPU Manager to manage, update, and deploy vGPU drivers to ESXi hosts

Let’s get to it!

What is the NVIDIA GPU Manager for VMware vCenter

The NVIDIA GPU Manager is an (OVA) appliance that you can deploy in your VMware vSphere infrastructure (using vCenter and ESXi) to act as a driver (and update) repository for vLCM (vSphere Lifecycle Manager).

In addition to acting as a repo for vLCM, it also installs a plugin on your vCenter that provides a GUI for browsing, selecting, and downloading NVIDIA vGPU host drivers to the local repo running on the appliance. These updates can then be deployed using LCM to your hosts.

In short, this allows you to easily select, download, and deploy specific NVIDIA vGPU drivers to your ESXi hosts using vLCM baselines or images, simplifying the entire process.

Supported vSphere Versions

The NVIDIA GPU Manager supports the following vSphere releases (vCenter and ESXi):

  • VMware vSphere 8.0 (and later)
  • VMware vSphere 7.0U2 (and later)

The NVIDIA GPU Manager supports vGPU driver releases 15.1 and later, including the new vGPU 16 release version.

How to deploy and configure the NVIDIA GPU Manager for VMware vCenter

To deploy the NVIDIA GPU Manager Appliance, we have to download an OVA (from NVIDIA’s website), then deploy and configure it.

See below for the step by step instructions:

Download the NVIDIA GPU Manager

  1. Log on to the NVIDIA Application Hub, and navigate to the “NVIDIA Licensing Portal” (https://nvid.nvidia.com).
  2. Navigate to “Software Downloads” and select “Non-Driver Downloads”
  3. Change Filter to “VMware vCenter” (there is both VMware vSphere, and VMware vCenter, pay attention to select the correct).
  4. To the right of “NVIDIA GPU Manager Plug-in 1.0.0 for VMware vCenter”, click “Download” (see below screenshot).
Screenshot of download link for NVIDIA GPU Manager for VMware vCenter
NVIDIA GPU Manager Download Page

After downloading the package and extracting, you should be left with the OVA, along with Release Notes, and the User Guide. I highly recommend reviewing the documentation at your leisure.

Deploy and Configure the NVIDIA GPU Manager

We will now deploy the NVIDIA GPU Manager OVA appliance:

  1. Deploy the OVA to either a cluster with DRS, or a specific ESXi host. In vCenter either right click a cluster or host, and select “Deploy OVF Template”. Choose the GPU Manager OVA file, and continue with the wizard.NVIDIA GPU Manager OVA Deploy
  2. Configure Networking for the Appliance
    • You’ll need to assign an IP Address, and relevant networking information.
    • I always recommend creating DNS (forward and reverse entries) for the IP.NVIDIA GPU Manager OVA Network Configuration
  3. Finally, power on Appliance.

We must now create a role and service account that the GPU Manager will use to connect to the vCenter server.

While the vCenter Administrator account will work, I highly recommend creating a service account specifically for the GPU Manager that only has the required permissions that are necessary for it to function.

  1. Log on to your vCenter Server
  2. Click on the hamburger menu item on the top left, and open “Administration”.
  3. Under “Access Control” select Roles. vCenter-Roles
  4. Select New to create a new role. We can call it “NVIDIA Update Services”.
  5. Assign the following permissions:
    • Extension Privileges
      • Register Extension
      • Unregister Extension
      • Update Extension
    • VMware vSphere Lifecycle Manager Configuration Priveleges
      • Configure Service
    • VMware vSphere Lifecycle Manager Settings Priveleges
      • Read
    • Certificate Management Privileges
      • Create/Delete (Admins priv)
      • Create/Delete (below Admins priv)
    • ***PLEASE NOTE: The above permissions were provided in the documentation and did not work for me (resulted in an insufficient privileges error). To resolve this, I chose “Select All” for “VMware vSphere Lifecycle Manager”, which resolved the issue.***
  6. Save the Role
  7. On the left hand side, navigate to “Users and Groups” under “Single Sign On”
  8. Change the domain to your local vSphere SSO domain (vsphere.local by default)
  9. Create a new user account for the NVIDIA appliance, as an example you could use “nvidia-svc”, and choose a secure password.
  10. Navigate to “Global Permissions” on the left hand side, and click “Add” to create a new permission.
  11. Set the domain, and choose the new “nvidia-svc” service account we created, and set the role to “NVIDIA Update Services”, and check “Propagate to Children”.
  12. You have now configured the service account.

Now, we will perform the initial configuration of the appliance. To configure the application, we must do the following:

  1. Access the appliance using your browser and the IP you configured above (or FQDN) GPU Manager Account Creation
  2. Create a new password for the administrative “vcp_admin” account. This account will be used to manage the appliance.
    • A secret key will be generated that will allow the password to be reset, if required. Save this key somewhere safe.
  3. We must now register the appliance (and plugin) with our vCenter Server. Click on “REGISTER”. NVIDIA GPU Manager Register
  4. Enter the FQDN or IP of your vCenter server, the NVIDIA Service account (“nvidia-svc” from example), and password.
  5. Once the GPU Manager is registered with your vCenter server, the remainder of the configuration will be completed from the vCenter GPU.
    • The registration process will install the GPU Manager Plugin in to VMware vCenter
    • The registration process will also configure a repository in LCM (this repo is being hosted on the GPU manager appliance).

We must now configure an API key on the NVIDIA Licensing portal, to allow your GPU Manager to download updates on your behalf.

  1. Open your browser and navigate to https://nvid.nvidia.com. Then select “NVIDIA LICENSING PORTAL”. Login using your credentials.
  2. On the left hand side, select “API Keys”.
  3. On the upper right hand, select “CREATE API KEY”.
  4. Give the key a name, and for access type choose “Software Downloads”. I would recommend extending the key validation time, or disabling key expiration. NVIDIA Download API Create Key
  5. The key should now be created.
  6. Click on “view api key”, and record the key. You’ll need to enter this in later in to the vCenter GPU Manager plugin.

And now we can finally log on to the vCenter interface, and perform the final configuration for the appliance.

  1. Log on to the vCenter client, click on the hamburger menu, and select “NVIDIA GPU Manager”.
  2. Enter the API key you created above in to the “NVIDIA Licensing Portal API Key” field, and select “Apply”.
  3. The appliance should now be fully configured and activated. GPU Manager Activated API Key
  4. Configuration is complete.

We have now fully deployed and completed the base configuration for the NVIDIA GPU Manager.

Using the NVIDIA GPU Manager to manage, update, and deploy vGPU drivers to ESXi hosts

In this section, I’ll be providing an overview of how to use the NVIDIA GPU Manager to manage, update, and deploy vGPU drivers to ESXi hosts. But first, lets go over the workflow…

The workflow is a simple one:

  1. Using the vCenter client plugin, you choose the drivers you want to deploy. These get downloaded to the repo on the GPU Manager appliance, and are made available to Lifecycle Manager.
  2. You then use Lifecycle Manager to deploy the vGPU Host Drivers to the applicable hosts, using baselines or images.

As you can see, there’s not much to it, despite all the configuration we had to do above. While it is very simple, it simplifies management quite a bit, especially if you’re using images with Lifecycle Manager.

To choose and download the drivers, load up the plugin, use the filters to filter the list, and select your driver to download.

GPU Manager downloading vGPU Driver
NVIDIA GPU Manager downloading vGPU Driver

As you can see in the example, I chose to download the vGPU 15.3 host driver. Once completed, it’ll be made available in the repo being hosted on the appliance.

Once LCM has a changed to sync with the updated repos, the driver is then made available to be deployed. You can then deploy using baselines or host images.

LCM Image Update with NVIDIA vGPU Driver from NVIDIA GPU Manager
LCM Image Update with NVIDIA vGPU Driver from NVIDIA GPU Manager

In the example above, I added the vGPU 16 (535.54.06) host driver to my clusters update image, which I will then remediate and deploy to all the hosts in that cluster. The vGPU driver was made available from the download using GPU Manager.

Jul 252023
 

When it comes to virtualized workloads, one thing I commonly see overlooked in the design of the solution, is the placement of workloads. In this post, I want to cover VMware vSphere VM placement rules using the “VM/Host Rules” feature.

This is a feature that I commonly see overlooked and not configured, especially in smaller single cluster environments, however I’ve also seen this happen in very large scale environments as well.

Let’s cover the why, what, who, and how…

VM Workloads

While VMware vSphere does have a number of technologies built in for redundancy, load-balancing, and availability, as part of the larger solution we often find our workloads, specifically 3rd party platforms, with their own solutions that accomplish the same thing.

We need to identify which HA (High Availability) or redundancy solution to use, based on the application, service, and how it works.

For example, using VMware vSphere HA, or High Availability, if vCenter (and/or vCLS) detects a host goes offline, it can restart the workload on other online hosts. There is time associated with the detection and boot time, resulting in a loss of service during this period.

Third party solutions often have their own high availability or redundancy built in to the solution, such as Microsoft Active Directory. In this case with a standard configuration, at any time, any domain controller can respond to a clients request for resources. If one DC goes offline, other DCs can respond to the request resulting in no downtime.

Obviously, in the case of Active Directory Domain Controllers, you’d much prefer to have multiple DCs in your environment, instead of using one with vSphere HA.

Additionally, if you did have multiple domain controllers, you’d want to make sure they aren’t all placed on the same ESXi host. This is where we start to incorporate VM placement in to our solution.

VM Placement

When it comes to 3rd party solutions like mentioned above, we need to identify these workloads and factor them in to the design of the solution we are either implementing, maintaining, or improving.

Example of VM workloads used with VM Placement

A few examples of these workloads with their own load-balancing and availability technologies:

  • Microsoft Windows Active Directory Domain Controllers
  • Microsoft Windows Servers running DNS/DHCP Servers
  • Virtualized Active/Active or Active/Passive Firewall Appliances
  • VMware Horizon UAG (Unified Access Gateway) configured in HA mode
  • Other servers/services that have their own availability systems

As you can see, the applications all have their own special solution for availability, so we must insure the different “nodes” or “instances” are running on different ESXi hosts to avoid a host failure bringing down the entire solution.

Unless otherwise specified by the 3rd party vendor, I would recommend using VM/Host Rules in combination with vSphere DRS and HA.

Configuring VM Placement with VM/Host Rules

To configure these rules, follow the instructions below:

  1. Log on to your VMware vCenter Server
  2. Select a Cluster
  3. Click on the “Configure” tab, and then “VM/Host Rules”
    • Here you can Add/Edit/Delete VM Host Rules
  4. Click on “Add”, and give the rule a new name (Example: Domain Controllers)
  5. For “Type”, select “Separate Virtual Machines”
  6. Click “Add” and select your Domain Controllers and add them to the rule.
Screenshot rule creation for VM placement using VM Host Rules
Domain Controller VM Placement VM Host Rule

After you click “OK”, the rule should now be saved, and DRS will make sure these VMs are now running on separate hosts.

Below you can see another example of a configured system, separating 2 Active/Passive Firewall appliances.

VM placement and VM/Host Rules for Firewall appliances
VM/Host Rules for Firewall Appliances

As you can see, VM placement with VM/Host Rules is very easy to configure and deploy.

Additional Considerations

Note, if you implement these rules and do not have enough hosts to fullfill the requirements, the hosts may fail to be evacuated by DRS when placing in maintenance mode, or remediating with vLCM (Lifecycle Manager).

In this case, you’ll need to manually vMotion the VM’s to other hosts (to violate the rule) or shut some down.

Jul 242023
 
Picture of an DL360p Gen8 1U Rack Server with IO-PEX40152 Installed

A few months ago, you may have seen my post detailing my experience with ESXi 7.0 on HP Proliant DL360p Gen8 servers. I now have an update as I have successfully loaded ESXi 8.0 on HPE Proliant DL360p Gen8 servers, and want to share my experience.

It wasn’t as eventful as one would have expected, but I wanted to share what’s required, what works, and stability observations.

Please note, this is NOT supported and NOT recommended for production environments. Use the information at your own risk.

A special thank you goes out to William Lam and his post on Homelab considerations for vSphere 8, which provided me with the boot parameter required to allow legacy CPUs.

ESXi on the DL360p Gen8

With the release of vSphere 8.0 Update 1, and all the new features and functionality that come with the vSphere 8 release as a whole, I decided it was time to attempt to update my homelab.

In my setup, I have the following:

  • 2 x HPE Proliant DL360p Gen8 Servers
    • Dual Intel Xeon E5-2660v2 Processors in each server
    • USB and/or SD for booting ESXi
    • No other internal storage
    • NVIDIA A2 vGPU (for use with VMware Horizon)
  • External SAN iSCSI Storage

Since I have 2 servers, I decided to do a fresh install using the generic installer, and then use the HPE addon to install all the HPE addons, drivers, and software. I would perform these steps on one server at a time, continuing to the next if all went well.

I went ahead and documented the configuration of my servers beforehand, and had already had upgraded my VMware vCenter vCSA appliance from 7U3 to 8U1. Note, that you should always upgrade your vCenter Server first, and then your ESXi hosts.

To my surprise the install went very smooth (after enabling legacy CPUs in the installer) on one of the hosts, and after a few days with no stability issues, I then proceeded and upgraded the 2nd host.

I’ve been running with 100% for 25+ days without any issues.

The process – Installing ESXi 8.0

I used the following steps to install VMware vSphere ESXi 8 on my HP Proliant Gen8 Server:

  1. Download the Generic ESXi installer from VMware directly.
    1. Link: Download VMware vSphere
  2. Download the “HPE Custom Addon for ESXi 8.1”.
    1. Link: HPE Custom Addon for ESXi 8.0 U1 June 2023
    2. Other versions of the Addon are here: HPE Customized ESXi Image.
  3. Boot server with Generic ESXi installer media (CD or ISO)
    • IMPORTANT: Press “Shift + o” (Shift key, and letter “o”) to interrupt the ESXi boot loader, and add “AllowLegacyCPU=true” to the kernel boot parameters.
  4. Continue to install ESXi as normal.
    • You may see warnings about using a legacy CPU, you can ignore these.
  5. Complete initial configuration of ESXi host
  6. Mount NFS or iSCSI datastore.
  7. Copy HPE Custom Addon for ESXi zip file to datastore.
  8. Enable SSH on host (or use console).
  9. Place host in to maintenance mode.
  10. Run “esxcli software vib install -d /vmfs/volumes/datastore-name/folder-name/HPE-801.0.0.11.3.1.1-Jun2023-Addon-depot.zip” from the command line.
  11. The install will run and complete successfully.
  12. Restart your server as needed, you’ll now notice that not only were HPE drivers installed, but also agents like the Agentless management agent, and iLO integrations.

After that, everything was good to go… Here you can see version information from one of the ESXi hosts:

ESXi 8 on HPE Proliant DL360p Gen8
VMware ESXi version 8.0.1 Build 21813344 on HPE Proliant DL360p Gen8 Server

What works, and what doesn’t

I was surprised to see that everything works, including the P420i embedded RAID controller. Please note that I am not using the RAID controller, so I have not performed extensive testing on it.

HPE P420i RAID Controller with VMware vSphere ESXi 8
HPE P420i RAID Controller with VMware vSphere ESXi 8

All Hardware health information is present, and ESXi is functioning as one would expect if running a supported version on the platform.

Additional Information

Note that with vSphere 8, VMware is deprecating vLCM baselines. Your focus should be to update your ESXi instances using cluster image based update images. You can incorporate vendor add-ons and components to create a customized image for deployment.

Jul 232023
 
Azure AD SSO with Horizon

With the release of VMware Horizon 2303, VMware Horizon now supports Hybrid Azure AD Join with Azure AD Connect using Instant Clones and non-persistent VDI.

So what exactly does this mean? It means you can now use Azure SSO using PRT (Primary Refresh Token) to authenticate and access on-premise and cloud based applications and resources.

What else? It allows you to use conditional access!

What is Hybrid Azure AD Join, and why would we want to do it with Azure AD Connect?

Historically, it was a bit challenging when it came to Understanding Microsoft Azure AD SSO with VDI (click to read the post and/or see the video), and special considerations had to be made when an organization wished to implement SSO between their on-prem non-persistent VDI deployment and Azure AD.

Screenshot of a Hybrid Azure AD Joined login
Hybrid Azure AD Joined Login

Azure AD SSO, the old way

The old way to accomplish this was to either implement Azure AD with ADFS, or use Seamless SSO. ADFS was bulky and annoying to manage, and Seamless SSO was actually intended to enable SSO on “downlevel devices” (older operating systems before Windows 10).

For customers without ADFS, I would always recommend using Seamless SSO to enable SSO on non-persistent VDI Instant Clones, until now!

Azure AD SSO, the new way with Azure AD Connect and Azure SSO PRTs

According to the release notes for VMware Horizon 2303:

Hybrid Azure Active Directory for SSO is now supported on instant clone desktop pools. See KB 89127 for details.

This means we can now enable and use Azure SSO with PRTs (Primary Refresh Tokens) using Azure AD Connect and non-persistent VDI Instant Clones.

Azure SSO with PRT and Non-Persistent VDI

This is actually a huge deal because not only does it allow us to use the preferred method for performing SSO with Azure, but it also allows us to start using fancy Azure features like conditional access!

Requirements for Hybrid Azure AD Join with non-persistent VDI and Azure AD Connect

In order to utilize Hybrid Join and PRTs with non-persistent VDI on Horizon, you’ll need the following:

  • VMware Horizon 2303 (or later)
  • Active Directory
  • Azure AD Connect (Implemented, Configured, and Functioning)
    • Azure AD Hybrid Domain Join must be enabled
    • OU and Object filtering must include the non-persistent computer objects and computer accounts
  • Create a VMware Horizon Non-Persistent Desktop Pool for Instant Clones
    • “Allow Reuse of Existing Computer Accounts” must be checked

When you configure this, you’ll notice that after provisioning a desktop pool (or pushing a new snapshot), that there may be a delay for PRTs to be issued. This is expected, however the PRT will be issued eventually, and subsequent desktops shouldn’t experience issues unless you have a limited number available.

*Please note: VMware still notes that ADFS is the preferred way for fast issuance of the PRT.

While VMware does recommend ADFS for performance when issuing PRTs, in my own testing I had no problems or complaints, however when deploying this in production I’d recommend that because of the PRT delay after deploying the pool or a new snapshot, to do this after hours or SSO will not function for some users who immediately get a new desktop.

Additional Considerations

Please note the following:

  • When switching from ADFS to Azure AD Connect, the sign-in process may change for users.
    • You must prepare the users for the change.
  • When using locally stored identifies and/or cached credentials, enabling Azure SSO may change the login process, or cause issues for users signing in.
    • You may have to delete saved credentials in the users persistent profile
    • You may have to adjust GPOs to account for Azure SSO
    • You may have to modify settings in your profile persistent solution
      • Example: “RoamIdentity” on FSLogix
  • I recommend testing before implementing
    • Test Environment
    • Test with new/blank user profiles
    • Test with existing users

If you’re coming from an environment that was previously using Seamless SSO for non-persistent VDI, you can create new test desktop pools that use newly created Active Directory OU containers and adjust the OU filtering appropriately to include the test OUs for synchronization to Azure AD with Azure AD Connect. This way you’re only syncing the test desktop pool, while allowing Seamless SSO to continue to function for existing desktop pools.

How to test Azure AD Hybrid Join, SSO, and PRT

To test the current status of Azure AD Hybrid Join, SSO, and PRT, you can use the following command:

dsregcmd /status

To check if the OS is Hybrid Domain joined, you’ll see the following:

+----------------------------------------------------------------------+
| Device State                                                         |
+----------------------------------------------------------------------+

             AzureAdJoined : YES
          EnterpriseJoined : NO
              DomainJoined : YES
                DomainName : DOMAIN

As you can see above, “AzureADJoined” is “YES”.

Further down the output, you’ll find information related to SSO and PRT Status:

+----------------------------------------------------------------------+
| SSO State                                                            |
+----------------------------------------------------------------------+

                AzureAdPrt : YES
      AzureAdPrtUpdateTime : 2023-07-23 19:46:19.000 UTC
      AzureAdPrtExpiryTime : 2023-08-06 19:46:18.000 UTC
       AzureAdPrtAuthority : https://login.microsoftonline.com/XXXXXXXX-XXXX-XXXXXXX
             EnterprisePrt : NO
    EnterprisePrtAuthority :
                 OnPremTgt : NO
                  CloudTgt : YES
         KerbTopLevelNames : XXXXXXXXXXXXX

Here we can see that “AzureAdPrt” is YES which means we have a valid Primary Refresh Token issued by Azure AD SSO because of the Hybrid Join.