Apr 292018
 
Directory Services Restore Mode

Running Veeam Backup and Replication, a Microsoft Windows Server Domain Controller may boot in to safe mode and directory services restore mode.

About a week ago, I loaded up Veeam Backup and Replication in to my test environment. It’s a fantastic product, and it’s working great, however today I had a little bit of an issue with a DC running Windows Server 2016 Server Core.

I woke up to a notification that the backup failed due to a VSS snapshot issue. Now I know that VSS can be a little picky at times, so I decided to restart the guest VM. Upon restarting, she came back up, was pingable, and appeared to be running fine, however the backup kept failing with new errors, the event log was looking very strange on the server, and numerous services that were set to automatic were not starting up.

This specific server was installed using Server Core mode, so it has no GUI and is administered via command prompt over RDP, or via remote management utilities. Once RDP’ing in to the server, I noticed the “Safe Mode” branding on each corner of the display, this was very odd. I restarted the server again, this time manually trying to start Active Directory Services manually via services.msc.

This presented:

Event ID: 16652
Source: Directory-Services-SAM
General Description: The domain controller is booting to directory services restore mode.

Screenshot:

Directory Services Restore Mode

The domain controller is booting to directory services restore mode.

 

This surprised me (and scared me for that matter). I immediately started searching the internet to find out what would have caused this…

To my relief, I read numerous sites that advise that when an active backup is running on a guest VM which is a domain controller, Veeam activates directory services restore mode temporarily, so in the event of a restore, it will boot to this mode automatically. In my case, the switch was not changed back during the backup failure.

Running the following command in a command prompt, verifies that the safeboot switch is set to dsrepair enabled:

bcdedit /v

To disable directory services restore mode, type the following in a command prompt:

bcdedit /deletevalue safeboot

Restart the server and the issue should be resolved!

Apr 232018
 

Ready to jump the gun and upgrade to vSphere 6.7? Hold on a moment…

You’ll remember some time ago VMware announced they are dropping support for vSphere vDP (vSphere Data Protection). If you’re running this in your environment, it will break when upgrading to vSphere 6.7.

A better idea would be to migrate over to a product like Veeam, however please note that as of this date, Veeam does not officially support vSphere 6.7. Support should be coming in the next major update.

Apr 172018
 

With the news of VMware vSphere 6.7 being released today, a lot of you are looking for the download links for the 6.7 download (including vSphere 6.7, ESXi 6.7, etc…). I couldn’t find it myself, but after doing some scouring through alternative URLs, I came across the link.

VMware vSphere 6.7 Download

VMware vSphere 6.7 Download Link

Here’s the link: https://my.vmware.com/web/vmware/info/slug/datacenter_cloud_infrastructure/vmware_vsphere/6_7

HPe Specific (HPe Customization for ESXi) Version 6.7 is available at: https://www.hpe.com/us/en/servers/hpe-esxi.html

Unfortunately the page is blank at the moment, however you can bet the download and product listing will be added shortly!

UPDATE 10:15AM MST: The Download link is now live!

More information on the release of vSphere 6.7 can be found here, here, here, here, here, and here.

An article on the upgrade can be found at: https://blogs.vmware.com/vsphere/2018/05/upgrading-vcenter-server-appliance-6-5-6-7.html

Happy Virtualizing!

Jan 262018
 
10ZiG 5948q Zero Client

In an effort to truly showcase the capabilities of VMware Horizon View and the 10ZiG 5948q Zero Client, I wanted to put together a demo showing the ability to run Red Hat Enterprise Linux (RHEL 7.4) in a VDI environment.

First and foremost, this was super easy to setup. It was almost too easy…

Please see below for video:

You’ll notice during login that after the credential prompt multiple desktops were available (we chose to log on to the RHEL instance) to choose from. You’ll see further on in the video versioning and specifications as well as video playback. Again, please note that my environment does not have any GPU or 3D rendering.

Please Note: The momentary black border on the bottom right side during login, was due to a resolution change on the VDI session. This was the first time logging in with this client, and the border doesn’t normally occur unless changing resolutions.

Equipment/Software used in this demo:

Please note, my company Digitally Accurate Inc, is a VMware Solution Provider Partner, 10ZiG Partner, and Red Hat Ready Business partner. Please don’t hesitate in reaching out for anything VDI! We design, sell, implement, and support VMware and VDI environments!

Jan 202018
 
10ZiG 5948q Zero Client

As promised in my previous post which covered my first impressions, here are some pictures and video of the 10ZiG 5948q zero client in action! In the videos I demonstrate video playback as well as the USB redirection capabilities of the 10ZiG Zero Client and VMware Horizon View. Scroll down to the bottom for videos!

Pictures

10ZiG 5948q Zero Client

10ZiG 5948q Zero Client

 

10ZiG 5948qv Zero Client VMware Horizon View

10ZiG 5948qv Zero Client VMware Horizon View

 

10ZiG 5900 Series Zero Client VMware Horizon View Login

10ZiG 5900 Series Zero Client VMware Horizon View Login

 

10ZiG 5948q Series Zero Client Configuration Menu

10ZiG 5948q Series Zero Client Configuration Menu

 

Videos

In this video, I demonstrate the capability of the 10ZiG 5948q zero client connected to a VMware Horizon View server (via a Unified Access Gateway) playing a video from YouTube. Please note that the ESXi server does not have a GPU and 3D rendering is disabled for the test (this is as low performance as it gets).

In this video, I demonstrate the capability of the 10ZiG 5948q zero client using USB redirection on a live VDI session.

And finally, here’s a video of a 10ZiG zero client cold boot for those that are interested.

And remember, my company Digitally Accurate Inc. is a VMware Solutions Provider and 10ZiG Partner. I’m also regularly posting content on these on the corporate blog as well!

Jan 202018
 
10ZiG 5948q Zero Client

It’s an exciting weekend as I got my hands on 2 new 10ZiG 5948q Series Zero Clients (the 10ZiG 5948qv to be exact). My company Digitally Accurate Inc. in Calgary, Alberta purchased these for internal testing and demo purposes since we sell VMware VDI solutions. I figured I’d do a brief post to offer my first impressions, offer a few pictures, as well as a brief write up.

UPDATED POST – More pictures and videos can be found on a new post here!

For a brief backstory, zero clients are used for environments and businesses that use desktop virtualization (VDI) instead of traditional computers. This means that an employee won’t actually have a computer, instead they will be using a zero client which relays video/keyboard/mouse (it’s actually more complicated than that, lol) to a desktop computer that’s virtualized on a server (either in the cloud, data center, or on-premise). This allows the company to save on hardware, provide better performance to end users, and really simplify a big portion of the IT landscape.

I’m a big fan of VDI, particularly the VMware Horizon View product. My company has a full demo/test VDI environment we have available to showcase this technology.

What’s really awesome is that you don’t necesarily even need to use a zero client. You could instead use an old netbook, laptop, computer, or even a cellphone to connect to these virtualized desktops using the VMware Horizon View client.

The next question you may be asking, is what about graphics quality? Well, you can implement special graphics cards to virtualize GPUs (such as the AMD FirePro MxGPU class of cards) and provide high end graphical capabilities to your users.

In my deployment we’ve been using software based VDI clients, but it was time to get our hands dirty with some zero clients. We decided to partner up with 10ZiG, and order a couple of these units for internal and demo purposes.

 

Now let’s get started!

10ZiG 5948q Zero Client

10ZiG 5948 Zero Client

The 5948q series is part of 10ZiG’s power class of zero clients. This bad boy is the latest and greatest model and supports 3 displays (yup, that’s right) and 4K UHD resolutions. The hardware itself is based off an Intel Braswell (refreshed version) CPU. The device has 1 HDMI port, as well as 2 DisplayPort connections. It also has a microphone, audio line out,  and a ton of USB ports (including both USB2 and USB3) for USB redirection. You can choose from 4GB or 8GB RAM configurations.

The 5948q runs off 10ZiG’s NOS operation system. This is a compact and lightweight operating system that includes the VMware Horizon View client, as well as all the configuration utilities you’ll need to get up and running (as well as troubleshoot the device if needed).

When I received the first two units, I noticed that the boxes were very heavy (I was actually really surprised).

10ZiG 5900q Series Zero Client Box Shot

10ZiG 5900q Series Zero Client Box Shot

 

10ZiG 5948q Series Zero Client Box Shot

10ZiG 5948q Series Zero Client Box Shot

 

The packaging was very nice, and the contents were safely secured with proper cardboard spacers to avoid items shifting during transport.

Finally it was time to take the zero client out! It is a very sleek, and very well put together device (I was actually surprised). As mentioned, it has some weight to it. It just looks great…

10ZiG 5948q Zero Client

 

The bottom of the device swivels and turns to expose 2 hidden USB ports which can be used for any type of USB device. The device also has 2 x USB3.0 ports on the front, an additional 1 x USB2.0 on the front, and 2 x USB2.0 ports on the back.

Plugging in the device and turning it on, it brings you to a big 10ZiG splash image and then on to first time configuration. In the first time configuration you specify your locale and region settings, date and time, and others. One problem I came across, is that when choosing Canada, it defaulted the keyboard to French for some reason. I actually didn’t even realize this until later when trying to log in to my VDI session (it was registering the wrong characters when pressing keys) and rejecting my password. It actually took me over an hour to realize this when I finally started poking around in the console and noticed that the “/” key was triggering an entirely different character. Changing the keyboard to US English fixed this (although in my opinion it should have been the default).

Moving on…

After the first time setup is complete, the interface is very simple yet powerful. You can configure your VMware View Connection Server address (and properties for the connection), as well as all the hardware components of the zero client, this includes: Network, Display, Mouse, Keyboard, USB devices, printers, etc…

After specifying my View connection server and then leaving the control panel, I logged in to my VDI session and it worked great!

Initially I tested some videos on YouTube, which looked great, and then moved on to trying some other things that involved audio and all was perfect. This product was production ready! I also tested the USB redirection by connecting a USB thumb flash drive to a USB port. The Windows 10 VM detected the USB drive, and I was able to partition, format, and use the stick without any problems. Initial testing resulted in 4MB/sec USB redirection speeds.

I’m really hoping that when my budget permits, I can purchase an AMD MxGPU S7150 card so I can test the 10ZiG with some high performance graphic applications. In the meantime though, this works great!

 

Now before I leave you, there are two important things I want to mention:

  1. The 10ZiG zero clients can be completely centrally managed and provisioned by the 10ZiG Manager, which is completely free if you own the products. You’d use this to deploy many zero clients (hundreds, thousands, etc…) in large scale deployments.
  2. You can use the 10ZiG Manager to reflash the firmware of the devices and use them for different desktop virtualization platforms including Citrix XenDesktop, and Microsoft VDI over RDP using Remote Desktop Services (RDS).

 

All in all, I’m very impressed with the device and have absolutely no complaints. I’ll be doing some more write ups and videos on the device soon! Stay posted!

Jan 182018
 

The Problem

I run a Sophos UTM firewall appliance in my VMware vSphere environment and noticed the other day that I was getting warnings on the space used on the ESXi host for the thin-provisioned vmdk file for the guest VM. I thought “Hey, this is weird”, so I enabled SSH and logged in to check my volumes. Everything looked fine and my disk usage was great! So what gives?

After spending some more time troubleshooting and not finding much, I thought to myself “What if it’s not unmapping unused blocks from the vmdk to the host ESXi machine?”. What is unampping you ask? When files get deleted in a guest VM, the free blocks aren’t automatically “unmapped” and released back to the host hypervisor in some cases.

Two things need to happen:

  1. The guest VM has to release these blocks (notify the hypervisor that it’s not using them, making the vmdk smaller)
  2. The host has to reclaim these and issue the unmap command to the storage (freeing up the space on the SAN/storage itself)

On a side note: In ESXi 6.5 and when using VMFS version 6 (VMFS6), the datastores can be configured for automatic unmapping. You can still kick it off manually, but many administrators would prefer it to happen automatically in the background with low priority (low I/O).

Most of my guest VMs automatically do the first step (releasing the blocks back to the host). On Windows this occurs with the defrag utility which issues trim commands and “trims” the volumes. On linux this occurs with the fstrim command. All my guest VMs do this automatically with the exception being the Sophos UTM appliance.

The fix

First, a warning: Enable SSH on the Sophos UTM at your own risk. You need to know what you are doing, this also may pose a security risk and should be disabled once your are finished. You’ll need to “su” to root once you log in with the “loginuser” account.

This fix not only applies to the Sophos UTM, but most other linux based guest virtual machines.

Now to fix the issue, I used the “df” command which provides a list of the filesystems, their mount points, and storage free for those fileystems. I’ve included an example below (this wasn’t the full list):

hostname:/root # df
Filesystem                       1K-blocks     Used Available Use% Mounted on
/dev/sda6                          5412452  2832960   2281512  56% /
udev                               3059712       72   3059640   1% /dev
tmpfs                              3059712      100   3059612   1% /dev/shm
/dev/sda1                           338875    15755    301104   5% /boot
/dev/sda5                         98447760 13659464  79513880  15% /var/storage
/dev/sda7                        129002700  4624468 117474220   4% /var/log
/dev/sda8                          5284460   274992   4717988   6% /tmp
/dev                               3059712       72   3059640   1% /var/storage/chroot-clientlessvpn/dev


You’ll need to run the fstrim command on every mountpoint for file systems “/dev/sdaX” (X means you’ll be doing this for multiple mountpoints). In the example above, you’ll need to run it on “/”, “/boot”, “/var/storage”, “/var/log”, “/tmp”, and any other mountpoints that use “/dev/sdaX” filesytems.

Two examples:

fstrim / -v

fstrim / -v

 

 

fstrim /var/storage -v

fstrim /var/storage -v

 

 

Again, you’ll repeat this for all mount points for your /dev/sdaX storage (X is replaced with the volume number). The command above only works with mountpoints, and not the actual device mappings.

Time to release the unused blocks to the SAN:

The above completes the first step of releasing the storage back to the host. Now you can either let the automatic unmap occur slowly overtime if you’re using VMFS6, or you can manually kick it off. I decided to manually kick it off using the steps I have listed at: https://www.stephenwagner.com/2017/02/07/vmfs-unmap-command-on-vsphere-6-5-with-vmfs-6-runs-repeatedly/

You’ll need to use esxcli to do this. I simply enabled SSH on my ESXi hosts temporarily.

Please note: Using the unmap command on ESXi hosts is very storage I/O intensive. Do this during maintenance window, or at a time of low I/O as this will perform MAJOR I/O on your hosts…

I issue the command (replace “DATASTORENAME” with the name of your datastore):

esxcli storage vmfs unmap --volume-label=DATASTORENAME --reclaim-unit=8

This could run for hours, possibly days depending on your “reclaim-unit” size (this is the block size of the unit you’re trying to reclaim from the VMFS file-system). In this example I choose 8, but most people do something larger like 100, or 200 to reduce the load and time for the command to complete (lower values look for smaller chunks of free space, so the command takes longer to execute).

I let this run for 2 hours on a 10TB datastore, however it may take way longer (possibly 6+ hours, to days).

Finally, not only are we are left with a smaller vmdk file, but we’ve released the space back to the SAN as well!

Jan 062018
 

Last night I updated my VMware VDI envionrment to VMware Horizon 7.4.0. For the most part the upgrade went smooth, however I discovered an issue (probably unrelated to the upgrade itself, and more so just previously overlooked). When connecting with Google Chrome to  VMware Horizon HTML Access via the UAG (Unified Access Gateway), an error pops up after pressing the button saying “Failed to connected to the connection server”.

The Problem:

This error pops up ONLY when using Chrome, and ONLY when connecting through the UAG. If you use a different browser (Firefox, IE), this issue will not occur. If you connect using Chrome to the connection server itself, this issue will not occur. It took me hours to find out what was causing this as virtually nothing popped up when searching for a solution.

Finally I stumbled across a VMware document that mentions on View Connection Server instances and security servers that reside behind a gateway (such as a UAG, or Access Point), the instance must be aware of the address in which browsers will connect to the gateway for HTML access.

The VMware document is here: https://docs.vmware.com/en/VMware-Horizon-7/7.0/com.vmware.horizon-view.installation.doc/GUID-FE26A9DE-E344-42EC-A1EE-E1389299B793.html

To resolve this:

On the view connection server, create a file called “locked.properties” in “install_directory\VMware\VMware View\Server\sslgateway\conf\”.

If you have a single UAG/Access Point, populate this file with:

portalHost=view-gateway.example.com

If you have multiple UAG/Access Points, populate the file with:

portalHost.1=view-gateway-1.example.com
portalHost.2=view-gateway-2.example.com

Restart the server

The issue should now be resolved!

On a side note, I also deleted my VMware Unified Access Gateways VMs and deployed the updated version that ship with Horizon 7.4.0. This means I deployed VMware Unified Access Gateway 3.2.0. There was an issue importing the configuration from the export backup I took from the previous version, so I had to configure from scratch (installing certificates, configuring URLs, etc…), be aware of this issue importing configuration.

 

Oct 272017
 

I went to re-deploy some vDP appliances today and noticed a newer version was made available a few months ago (vSphere Data Protection 6.1.5). After downloading the vSphereDataProtection-6.1.5.ova file, I went to deploy it to my vSphere cluster and it failed due to an invalid certificate and a message reading “The OVF package is signed with an invalid certificate”.

I went ahead and downloaded the certificate to see what was wrong with it. While the publisher was valid, the certificate was only valid from September 5th, 2016 to September 7th, 2017, and today was October 27th, 2017. It looks like the guys at VMware should have generated a new cert before releasing it.

 

 

To resolve this, you need to repackage the OVA file and skip the certificate using the VMware Open Virtualization Format Tool (ovftool) available at https://code.vmware.com/tool/ovf/4.1.0

Once you download and install this, the executable can be found in your Program Files\VMware\VMware OVF Tool folder.

Open a command prompt and change to the above directory and run the following:

ovftool.exe --skipManifestCheck c:\folder\vSphereDataProtection-6.1.5.ova c:\folder\vdpgood.ova

This command will repackage and remove the certificate from the OVA and save it as the new file named vdpgood.ova above. Afterwards deploy it to your vSphere environment and all should be working!

 

Feb 182017
 

This is an issue that effects quite a few people and numerous forum threads can be found on the internet by those searching for the solution.

This can occur both when taking manual snapshots of virtual machines when one chooses “Quiesce guest filesystem”, or when using snapshot based backup applications such as vSphere Data Protection (vSphere vDP).

 

For the last couple days, one of my test VMs (Windows Server 2012 R2) has been experiencing this issue and the snapshot has been failing with the following errors:

An error occurred while taking a snapshot: Failed to quiesce the virtual machine.
An error occurred while saving the snapshot: Failed to quiesce the virtual machine.

As always with standard troubleshooting, I restarted the VM, checked for VSS provider errors, and insured that the Windows Services involved with snapshots were in their correct state and configuration. Unfortunately this had no effect, and everything was configured the way it should be.

I also tried to re-install VMWare tools, which had no effect.

PLEASE NOTE: If you experience this issue, you should confirm the services are in their correct state and configuration, as outlined in VMware KB: 1007696. Source: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007696

 

The Surprise Fix:

In the days leading up to the failure when things were running properly, I did notice that the quiesced snapshots for that VM were taking a long time process, but were still functioning correctly before the failure.

This morning during troubleshooting, I went ahead and deleted all the Windows Volume Shadow Copies which are internal and inside of the Virtual Machine itself. These are the shadow copies that the Windows guest operating system takes on it’s own filesystem (completely unrelated to VMware).

To my surprise after doing this, not only was I able to create a quiesced snapshot, but the snapshot processed almost instantly (200x faster than previously when it was functioning).

I’m assuming this was causing a high load for the VMware snapshot to process and a timeout was being hit on snapshot creation which caused the issue. While Windows volume shadow copies are unrelated to VMware snapshots, they both utilize the same VSS (Volume Shadow Copy Service) system inside of windows to function and process. One must also keep in mind that the Windows volume shadow copies will of course be part of a VMware snapshot.

PLEASE NOTE: Deleting your Windows Volume Shadow copies will delete your Windows volume snapshots inside of the virtual machine. You will lose the ability to restore files and folders from previous volume shadow copy snapshots. Be aware of what this means and what you are doing before attempting this fix.