Apr 25, 2021
 
Screenshot of a Hybrid Azure AD Joined login

If you’re using Azure AD and have Hybrid Azure AD joined machines, special considerations must be made for non-persistent VDI workstations and VMs. This applies to Instant Clones on VMware Horizon.

Due to the nature of non-persistent VDI, machines are created and destroyed on the fly, with the user getting an entirely new workstation on every login.

Hybrid Azure AD joined workstations not only register with the on-premises Active Directory domain, but also register with Azure AD (Azure Active Directory).

The Problem

If you have Hybrid Azure AD join configured and machines performing the hybrid join, numerous computer objects will be created in Azure AD, many in a misconfigured and/or unregistered state. When a non-persistent instant clone is destroyed and re-created, it will potentially have the same computer name as a previous machine, but it will be unable to use the existing registration.

This conflict state could potentially make your Azure AD computer OU a mess.

VMware Horizon 8 version 2303 now supports Hybrid Azure AD joined non-persistent instant clones using Azure AD Connect. If you are using an older version, or using a different platform for non-persistent VDI, you’ll need to reference the solution below.

The Solution

Please see below for a few workarounds and/or solutions:

  1. Upgrade to VMware Horizon 8 2303
  2. Use Seamless SSO instead of Hybrid Azure AD join
  3. Utilize logon/logoff scripts to Azure AD join and unjoin the machine as users log on and off. You may also need a cleanup script to remove old/stale device records from Azure AD, as this approach can and will create numerous computer accounts in Azure AD.
  4. Do not allow non-persistent virtual machines to perform the hybrid join at all. This can be accomplished either by removing the non-persistent VDI computer OU from synchronization with Azure AD Connect (OU filtering information at https://docs.microsoft.com/en-us/azure/active-directory/hybrid/how-to-connect-sync-configure-filtering) or by disabling the scheduled task that performs the Azure AD join (see the example commands below).
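
If you go with option 3 or 4, the underlying commands are fairly simple. The following is only a rough sketch (run on the golden image or from a logoff/shutdown script, assuming the default scheduled task path; test before rolling out):

REM Option 4: stop the golden image/clones from attempting the hybrid join
schtasks /Change /TN "\Microsoft\Windows\Workplace Join\Automatic-Device-Join" /Disable

REM Option 3: example unjoin for a logoff/shutdown script (must run as SYSTEM)
dsregcmd /leave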

In my environment I elected to remove the non-persistent computer OU from Azure AD Connect sync, and it’s been working great. It also keeps my Azure Active Directory nice and clean.

Oct 23, 2020
 
vCSA Update Installation

When updating VMware vCenter vCSA 7.0 U1 (Build 16858589) to vCSA 7.0 U1 (Build 17004997/17005016, Version 7.0.1.00100), you may notice that the update fails and reports issues with pre-update checks.

The pre-update checks performed before the update will pass and allow you to proceed; however, it’s the installation itself that fails and crashes, reporting this error.

After the installation fails, you will no longer be able to log in to the vCSA VAMI using the root account; it reports the error “Unable to Login”.

You are still able to log in via SSH; however, resetting the root password via SSH will not resolve this issue.

The Problem

In the past, issues with the root password expiring have caused similar behavior on the vCSA VAMI. Changing the root password does not resolve this specific issue.
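
If you want to rule out password expiry, a quick check from an SSH session (assuming you can still reach the appliance’s bash shell) is:

# Show password aging and expiry details for the root account
chage -l root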

After further troubleshooting, it appears that special characters in the root password, such as “!”, “.”, and “@”, caused this issue to occur in my environment.

I was not able to fix the broken vCSA after the failed update. Access to the vCSA VAMI was not possible; however, vCenter functions were still operating.

The Solution

To resolve this situation in my environment, I restored a snapshot of the vCSA taken prior to updating.

After restoring the snapshot, I changed the root password for VAMI and restarted the vCSA.
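
If you’d rather change it from an SSH session than from the VAMI, a simple sketch (either method should work once the snapshot has been restored) looks like this:

shell    # switch from the appliance shell to bash
passwd   # set a new root password, avoiding special characters such as "!", "." and "@"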

Another snapshot was taken prior to re-attempting the upgrade, which was now successful after removing the special characters from the root password.

Oct 15, 2020
 
VMware vCLS VM in VM List

Did a new VM called “vCLS” appear on your VMware vSphere cluster? Maybe multiple appeared, named “vCLS (1)”, “vCLS (2)”, and “vCLS (3)”.

VMware vCLS VM in vSphere Cluster Objects

This might be frightening, but fear not: this is part of VMware vSphere 7.0 Update 1.

What is the vCLS VM?

The vCLS virtual machine is essentially an “appliance” or “service” VM that allows a vSphere cluster to remain functioning in the event that the vCenter Server becomes unavailable. It maintains the health and services of that cluster.

Where did the vCLS VM come from?

The vCLS VM will appear after upgrading to vSphere 7.0 Update 1. I’m assuming it was deployed during the upgrade process.

It does not appear in the standard Cluster, Hosts, and VMs views, but it does appear when looking at the vSphere object VM lists, storage VM lists, etc.

Is it normal to have more than one vCLS VM?

The vCLS VMs are created when hosts are added to a vSphere Cluster. Up to 3 vCLS VMs are required to run in each vSphere Cluster.

The vCLS VMs will also appear on clusters which contain only one or two hosts. These configurations will result in either 1 or 2 vCLS VMs named “vCLS (1)” and “vCLS (2)”.
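
If you want to quickly enumerate the vCLS VMs outside of the vSphere Client, one possible approach (assuming you use the govc CLI with GOVC_URL and credentials already set; this is just one way to do it) is:

# List all vCLS system VMs in the vCenter inventory
govc find / -type m -name 'vCLS*'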

A note on licensing in regard to the vCLS VM

For VMware environments that use VM-based licensing, like vSphere for ROBO (Remote Office Branch Office), vCLS VMs are shown in the licensing interface as counting towards licensed VMs. Please note that these VMs do not officially count towards your purchased licenses, as they are VMware system VMs. Please read VMware KB 80472 for more information on this.

More Information on vCLS VMs

For more information and technical specifics, you can visit the link below:

https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vcenterhost.doc/GUID-96BD6016-4BE7-4B1C-8269-568D1555B08C.html

Hope this post helps, and puts some minds at ease. Your VMware environment has NOT been compromised.

Oct 10, 2020
 

If you’re like me and use an older Nvidia GRID K1 or K2 vGPU video card for your VDI homelab, you may notice when using VMware Horizon that VMware Blast h264 encoding is no longer being offloaded to the GPU and is instead being encoded by the CPU.

The Problem

Originally, when an environment was configured with an Nvidia GRID K1 or K2 card, not only did the card provide 3D acceleration and rendering, but it also offloaded the VMware Blast h264 stream (the visual session) so that the CPU didn’t have to. This resulted in less CPU usage and provided a streamlined experience for the user.

This functionality was handled via NVFBC (Nvidia Frame Buffer Capture) which was part of the Nvidia Capture SDK (formerly known as GRID SDK). This function allowed the video card to capture the video frame buffer and encode it using NVENC (Nvidia Encoder).

Ultimately, after spending hours troubleshooting, I learned that NVFBC has been deprecated and is no longer supported, which is why it’s no longer functioning. I also checked and noticed that the tools (such as nvfbcenable) are no longer bundled with the VMware Horizon agent. One can assume that the agent doesn’t even attempt to check for or use this function.

Symptoms

Before I was aware of this, I noticed that while 3D Acceleration and graphics were functioning, I was experiencing high CPU usage. Upon further investigation I noticed that my VMware BLAST sessions were not offloading h264 encoding to the video card.

VMware Horizon Performance Tracker with NVidia GRID K1

You’ll notice above that under the “Encoder” section, the “Encoder Name” was listed as “h264 4:2:0”. Normally this would say “NVIDIA NvEnc H264” (or “NVIDIA NvEnc HEVC” on newer cards) if it was being offloaded to the GPU.

Looking at a VMware Blast session log (Blast-Worker-SessionId1.log), the following lines can be seen.

[INFO ] 0x1f34 bora::Log: NvEnc: VNCEncodeRegionNvEncLoadLibrary: Loaded NVIDIA SDK shared library "nvEncodeAPI64.dll"
[INFO ] 0x1f34 bora::Log: NvEnc: VNCEncodeRegionNvEncLoadLibrary: Loaded NVIDIA SDK shared library "nvml.dll"
[WARN ] 0x1f34 bora::Warning: GetProcAddress: Failed to resolve nvmlDeviceGetEncoderCapacity: 127
[WARN ] 0x1f34 bora::Warning: GetProcAddress: Failed to resolve nvmlDeviceGetProcessUtilization: 127
[WARN ] 0x1f34 bora::Warning: GetProcAddress: Failed to resolve nvmlDeviceGetGridLicensableFeatures: 127
[INFO ] 0x1f34 bora::Log: NvEnc: VNCEncodeRegionNvEncLoadLibrary: Some NVIDIA nvml functions unavailable, unloading
[INFO ] 0x1f34 bora::Log: NvEnc: VNCEncodeRegionNvEncUnloadLibrary: Unloading NVIDIA SDK shared library "nvEncodeAPI64.dll"
[INFO ] 0x1f34 bora::Log: NvEnc: VNCEncodeRegionNvEncUnloadLibrary: Unloading NVIDIA SDK shared library "nvml.dll"
[WARN ] 0x1f34 bora::Warning: GetProcAddress: Failed to resolve nvmlDeviceGetEncoderCapacity: 127
[WARN ] 0x1f34 bora::Warning: GetProcAddress: Failed to resolve nvmlDeviceGetProcessUtilization: 127
[WARN ] 0x1f34 bora::Warning: GetProcAddress: Failed to resolve nvmlDeviceGetGridLicensableFeatures: 127

You’ll notice it tries to load the required functions; however, it fails.
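
If you want to check for the same behaviour in your own environment, you can search the Blast worker log for these entries. A quick example, assuming the default log location on the Horizon agent:

REM Show the NvEnc/nvml lines from the current Blast worker session log
findstr /i "NvEnc nvml" "C:\ProgramData\VMware\VMware Blast\Blast-Worker-SessionId1.log"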

The Solution

Unfortunately the only solution is to upgrade to newer or different hardware.

The GRID K1 and GRID K2 cards have reached their EOL (End of Life) and are no longer supported. The drivers are no longer being maintained or updated, so I doubt they will ever take advantage of the newer frame buffer capture functions of Windows 10.

Newer hardware and solutions have incorporated this change and use a different means of frame buffer capture.

To resolve this in my own homelab, I plan to migrate to an AMD FirePro S7150x2.

Jul 22, 2020
 
VMware Logon

When upgrading from any version of VMware vCSA to version 7.0, you may encounter a problem during the migration phase and be asked to specify a new “Export Directory”.

I’ve seen this occur on numerous upgrades and often find the same culprit causing the issue. I’ve found a very simple fix compared to other solutions online.

The full prompt for this issue is: “Enter a new export directory on the source machine below”

The Problem

When you upgrade the vCenter vCSA, the process migrates all data from the source appliance to the new vCSA 7 appliance.

This data can include the following (depending on your selection):

  • Configuration
  • Configuration and historical data (events and tasks)
  • Configuration and historical data (events, tasks, and performance metrics)
  • vSphere Update Manager (updates, configuration, etc.)

This data can accumulate, especially the VMware vSphere Update Manager data.

In the most recent upgrade I performed, I noticed that the smallest option (configuration only) was around 8GB, which is way over the 4.7GB default limit.

Could it be vSphere Update Manager?

I’ve seen VMware VUM cause numerous issues over the years with upgrades. VUM has caused issues upgrading from earlier versions to 6.x, and in this case it caused this issue upgrading to vCSA 7.x as well.

During my diagnosis, I logged in to the SSH console of the source appliance and noticed that the partition containing the VUM data (which includes update files) was using around 7.4GB. This is the “/storage/updatemgr/” partition.
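
To check this yourself on the source appliance, something along these lines from the bash shell will show how much space VUM is consuming:

# Check usage of the Update Manager partition and its downloaded patch data
df -h /storage/updatemgr
du -sh /storage/updatemgr/patch-store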

I wasn’t sure whether this data was included in the export, but if it was, the 8GB of configuration data minus the 7.4GB of VUM data could technically get me down to around 0.6GB for the migration.

In my environment, I have the default (and simple) implementation of VUM with the only customization being the HPE VIBs depot. I figured maybe I should blast away the VUM and start from scratch on VMware vCSA 7.0 to see if this fixes the issue.

The Fix

To fix this issue, I simply reset the VMware Update Manager database completely.

For details on this process and before performing these steps, please see VMware KB 2147284.

Let’s get to it:

  1. Close the migration window (you can reopen this later)
  2. Log in to your vCSA source appliance via SSH or console
  3. Run the applicable steps as defined in the VMware KB 2147284 to reset VUM (WARNING: commands are version specific). In my case on vCSA 6.5 I ran the following commands:
    1. shell
    2. service-control --stop vmware-updatemgr
    3. /usr/lib/vmware-updatemgr/bin/updatemgr-util reset-db
    4. rm -rf /storage/updatemgr/patch-store/*
    5. service-control --start vmware-updatemgr
  4. Open your web browser and navigate to https://new-vcsa-IP:5480 and resume the migration. You will now notice a significant space reduction and won’t need to specify a new mount point.

That’s it! You have a shiny new clean VUM instance, and can successfully upgrade to vCSA 7.0 without having to specify a new mount point.

To reconfigure and restore any old configuration to VUM, you’ll do so in the “VMware Lifecycle Management” section of the VMware vCenter Server Appliance interface.

Alternatively, in the rare event the size is not related to the VUM data, you can set the export directory to somewhere under “/tmp/”, which is another workaround that may allow you to continue.