Oct 202018
 

On an ESXi host when performing a manual unmap on your storage datastore, you may notice a very large (hidden) file on the datastore root called “.asyncUnmapFile”. This file could be taking up terabytes of space, and you aren’t able to delete this file.

asyncUnmapFile

Typically the asyncUnmapFile is used by the UNMAP feature on ESXi hosts to deal with unmapping and unallocating storage blocks on the SAN. When you run a manual UNMAP, this file should be created and should appear to using “0” (no) space (even if it is). When an UNMAP completes, this file should disappear and be automatically removed by the function. If an UNMAP is interupted, this file will not be deleted, allowing you to restart the process and upon a full successful completion, it should then be deleted.

The Problem

Some time ago, I had an issue when performing a manual UNMAP, where the ESXi host became unresponsive (due to memory issues). The command appeared to be completed, however I believe it caused potential issues or corruption on the iSCSI datastore. In subsequent runs, the UNMAP appeared to be functioning and working, however I didn’t realize that the asyncUnmapFile had grown to around 1.5TB.

This was noticed during a SAN storage audit, where we saw that the virtual pool on the SAN was using up way more storage than it should be on the datastore.

When we identified the file was this large and causing issues, we attempted to perform 2 UNMAPs (different reclaim sizes) to see if it would be automatically cleared afterwards. It had no effect and the file was unchanged.

We also tried to modify the permissions on the file, however when trying to delete it, it would report that the file or folder was not found, or that it does not exist. This was concerning as we were worried about potential datastore corruption.

It was also noticed that in the hostd.log and vmkernel.log we saw some errors where the host believed that the blocks on the datastore had already been freed: “on volume labeled ‘iSCSIDatastore01’ already freed by another host: This may be a non-issue”

The Solution

Unfortunately with all the research we did, we couldn’t find a clear-cut solution. With worries that the datastore may be corrupted, we needed to do something.

A decision was ultimately made to Storage vMotion all the VMs (Virtual Machines) to another datastore on a separate storage pool, delete the now empty LUN, and recreate it from scratch. After this, we used Storage vMotion again to move the VMs back.

Instantly I noticed that the VMs on that datastore were running faster (it’s only been 12 hours, so I’ll be adding an update in a few days to confirm). We no longer have the file on any of our datastores.

If anyone has further insight in to this issue, please leave a comment!

Oct 162018
 

In this post, I’ll be going over how to add additional and/or alternative UPN suffixes to your Active Directory. I’ll also be going over why you may require this inside of your environment.

This is also a follow up post to the article here: https://www.stephenwagner.com/2016/09/23/outlook-2016-exchange-2013-password-prompts-upn-and-samaccountname-troubles/ as Microsoft has deleted the KB 243629 article which contained the original instructions.

Why

There is a number of reasons why you may want to do this:

  1. You’re migrating to a newer version of Microsoft Outlook 2016, and require the users UPN to match the users e-mail address for auto-configure to function.
  2. Your internal domain is is a “domain.local” domain, however you want users to log in with a “domain.com” domain.
  3. You are implementing a line of business application or other piece of software that requires user’s UPNs to match their e-mail addresses.
  4. You’re performing a migration.

How

Let’s get to it! Here’s how to add an alternative UPN suffix to an Active Directory domain:

  1. Log on to your domain controller.
  2. Open “Active Directory Domains and Trusts”
  3. On the left hand side of the new window, right click on “Active Directory Domains and Trusts”, and select “Properties” (as shown below).
    Active Directory Domains and Trusts Window

    Active Directory Domains and Trusts Window

     

  4. Type in your new domain suffix in to the “Alternative UPN suffixes” box, and then click “Add”. As shown below.
    Add Alternative UPN suffix

    Add Alternative UPN suffix

     

  5. Click “Apply” and then close out of the windows.

The new UPN suffix should be available via “Active Directory Users and Computers” and you should be able to set it to users.

You can also configure the user accounts via the Exchange Administration Center (EAC). See below for an example:

Exchange Administration Center UPN Suffix

Exchange Administration Center UPN Suffix

 

Oct 122018
 
DNS

In the perfect and properly configured world, every internet user has a reverse DNS entry. This is is the DNS entry which tells people, servers, and services, what any given IP’s hostname is. Also, again in the perfect world, web servers shouldn’t check these, as the DNS query itself usually has to complete before it starts serving website data.

One of the key way’s webmasters and web server administrators increase their web server response times, is to make sure that their server is NOT performing reverse DNS queries when serving the site. However, we aren’t in a perfect world, and many web servers and web sites still perform these queries.

Many web servers do these queries because they are using mis-configured statistic generation software (website stats), default web server configuration files, or other reasons.

The problem

I recently had a discussion with a fellow IT professional where they were having issues with load times when opening websites. They were on a high speed business internet connection, so they figured it had to do with something else. They said they checked absolutely everything, so I decided to see what I could do to help out!

In my own research I noticed that on my own web server (which doesn’t perform reverse DNS queries on users), that numerous visitors both local to North America and abroad, did not actually have properly configured reverse DNS entries. One can deduce that when one of these users visits a website that actually performs an RDNS check during initial connection, it could cause a delay while the server itself waits for the DNS query to be performed (or even worse, timeout).

When further investigating, I also noticed a trend that the larger the company and the more expensive the internet connection, the more IPs that did not have reverse DNS records. I also noticed the IP addresses provided to my colleague did not have RDNS records.

I relayed this information back to my colleague and after they created the proper reverse DNS records, it seemed to help the issue!

Final Note

Since I don’t have direct access to their network, I couldn’t confirm this was the actual issue, or the only issue, but this just goes to show that you should always have your networks (both internal and external) properly configured using leading practices. In the long run, it saves time and avoids issues.

Oct 082018
 
Microsoft Windows Logo

If you are running Microsoft Windows in a domain environment with WSUS configured, you may notice that you’re not able to install some FODs (Features on Demand), or use the “Turn Windows features on or off”. This will stop you from installing things like the RSAT tools, .NET Framework, Language Speech packs, etc…

You may see “failure to download files”, “cannot download”, or errors like “0x800F0954” when running DISM to install packages.

To resolve this, you need to modify your domain’s group policy settings to allow your workstations to query Windows Update servers for additional content. The workstations will still use your WSUS server for approvals, downloads, and updates, however in the event content is not found, it will query Windows Update.

Enable download of “Optional features” directly from Windows Update

  1. Open the group policy editor on your domain
  2. Create a new GPO, or modify an existing one. Make sure it applies to the computers you’d like
  3. Navigate to “Computer Configuration”, “Policies”, “Administrative Templates”, and then “System”.
  4. Double click or open “Specify settings for optional component installation and component repair”
  5. Make sure “Never attempt to download payload from Windows Update” is NOT checked
  6. Make sure “Download repair content and optional features directly from Windows Update instead of Windows Server Update Services (WSUS)” IS checked.
  7. Wait for your GPO to update, or run “gpupdate /force” on the workstations.

Please see an example of the configuration below:

Download repair content and optional features directly from Windows Update instead of Windows Server Update Services (WSUS)

You should now be able to download/install RSAT, .NET, Speech language packs, and more!

Oct 072018
 
Microsoft Windows Logo

Just a few words of warning when upgrading your VMware vSphere Windows 10 virtual machines to Windows 10 Version 1809 (October Update). When upgrading, after the first restart, you may notice multiple BSOD (Blue Screen of Death) with the error “Driver PNP Watchdog”. This will fail the upgrade.

When the upgrade fails, the system will re-attempt until utlimately failing and reverting to the previous version of Windows 10.

In my case, I had a successful upgrade on numerous physical workstations, and a snapshot, so I decided to uninstall both the VMware tools agent, and VMware Horizon View agent. This had no affect and the VM still wouldn’t perform an upgrade.

I’m not sure if it’s the fact that it’s a VM, the VMware tools install, or the VMware Horizon View agent install, however I highly recommend waiting to upgrade until all the bugs get sorted out.

Leave a comment if you have anything to add! 🙂

Oct 052018
 
Microsoft Windows Logo

After you upgrade Microsoft Windows 10 to version 1809 (October Update), you may notice that your RSAT (Remote Server Administration Tools) have uninstalled and that you cannot download or install RSAT on the new version of Windows 10. This is because Microsoft has provided the RSAT tools as part of “Features on Demand” on Windows 10 itself.

Some of you may not be familiar with using the “Features on Demand” or “DISM” tool on Windows, so I decided to write up this little post to assist you in installing RSAT on the latest version of Windows 10.

Install RSAT on Windows 10 1809 (and higher)

To install RSAT on Windows 10 version 1809, open an elevated command and run the following command:

DISM.exe /Online /add-capability /CapabilityName:Rsat.ActiveDirectory.DS-LDS.Tools~~~~0.0.1.0 /CapabilityName:Rsat.BitLocker.Recovery.Tools~~~~0.0.1.0 /CapabilityName:Rsat.CertificateServices.Tools~~~~0.0.1.0 /CapabilityName:Rsat.DHCP.Tools~~~~0.0.1.0 /CapabilityName:Rsat.Dns.Tools~~~~0.0.1.0 /CapabilityName:Rsat.FailoverCluster.Management.Tools~~~~0.0.1.0 /CapabilityName:Rsat.FileServices.Tools~~~~0.0.1.0 /CapabilityName:Rsat.GroupPolicy.Management.Tools~~~~0.0.1.0 /CapabilityName:Rsat.IPAM.Client.Tools~~~~0.0.1.0 /CapabilityName:Rsat.LLDP.Tools~~~~0.0.1.0 /CapabilityName:Rsat.NetworkController.Tools~~~~0.0.1.0 /CapabilityName:Rsat.NetworkLoadBalancing.Tools~~~~0.0.1.0 /CapabilityName:Rsat.RemoteAccess.Management.Tools~~~~0.0.1.0 /CapabilityName:Rsat.RemoteDesktop.Services.Tools~~~~0.0.1.0 /CapabilityName:Rsat.ServerManager.Tools~~~~0.0.1.0 /CapabilityName:Rsat.Shielded.VM.Tools~~~~0.0.1.0 /CapabilityName:Rsat.StorageReplica.Tools~~~~0.0.1.0 /CapabilityName:Rsat.VolumeActivation.Tools~~~~0.0.1.0 /CapabilityName:Rsat.WSUS.Tools~~~~0.0.1.0 /CapabilityName:Rsat.StorageMigrationService.Management.Tools~~~~0.0.1.0 /CapabilityName:Rsat.SystemInsights.Management.Tools~~~~0.0.1.0

*Please Note: If you are using WSUS, you may not be configured to download “optional features” from Windows Update (resulting in “cannot download”, or error “0x800F0954”). To resolve this, please follow the instructions at: https://www.stephenwagner.com/2018/10/08/enable-windows-update-features-on-demand-and-turn-windows-features-on-or-off-in-wsus-environments/

Additional Notes

You’ll notice that by using the command above, we are installing multiple “capabilities”. Below is a list of the capabilities that we install to include the full RSAT feature set:

  • Rsat.ActiveDirectory.DS-LDS.Tools~~~~0.0.1.0
  • Rsat.BitLocker.Recovery.Tools~~~~0.0.1.0
  • Rsat.CertificateServices.Tools~~~~0.0.1.0
  • Rsat.DHCP.Tools~~~~0.0.1.0
  • Rsat.Dns.Tools~~~~0.0.1.0
  • Rsat.FailoverCluster.Management.Tools~~~~0.0.1.0
  • Rsat.FileServices.Tools~~~~0.0.1.0
  • Rsat.GroupPolicy.Management.Tools~~~~0.0.1.0
  • Rsat.IPAM.Client.Tools~~~~0.0.1.0
  • Rsat.LLDP.Tools~~~~0.0.1.0
  • Rsat.NetworkController.Tools~~~~0.0.1.0
  • Rsat.NetworkLoadBalancing.Tools~~~~0.0.1.0
  • Rsat.RemoteAccess.Management.Tools~~~~0.0.1.0
  • Rsat.RemoteDesktop.Services.Tools~~~~0.0.1.0
  • Rsat.ServerManager.Tools~~~~0.0.1.0
  • Rsat.Shielded.VM.Tools~~~~0.0.1.0
  • Rsat.StorageReplica.Tools~~~~0.0.1.0
  • Rsat.VolumeActivation.Tools~~~~0.0.1.0
  • Rsat.WSUS.Tools~~~~0.0.1.0
  • Rsat.StorageMigrationService.Management.Tools~~~~0.0.1.0
  • Rsat.SystemInsights.Management.Tools~~~~0.0.1.0

For more information on this change, you can visit the following URLS:

https://www.microsoft.com/en-ca/download/details.aspx?id=45520

https://docs.microsoft.com/en-us/windows-hardware/manufacture/desktop/features-on-demand-non-language-fod#remote-server-administration-tools-rsat

https://docs.microsoft.com/en-us/windows-hardware/manufacture/desktop/features-on-demand-v2–capabilities

Sep 162018
 
Microsoft Windows Logo

I’ve noticed an issue with Microsoft Windows Server 2016, where a default install, when joined to an Active Directory Domain, will not get it’s time from the domain itself, but rather from “time.windows.com”.

I first noticed this a couple months ago when I had some time issues with one of my Server 2016 member servers. I ran “net time” which reported time from the domain controller, so I simply restarted the VM and it resolved the issue (or so I thought). I did not know there was a larger underlying issue.

While performing maintenance today, I noticed that all Windows Server 2016 VMs were getting their time from “time.windows.com”. When running “w32tm /monitor”, the hosts actually reported the PDC time sources, yet it still used the internet ntp server. I checked all my Windows Server 2012 R2 member servers and they didn’t have the issue. All workstations running Windows 10 didn’t have the issue either.

When this issue occurs, you’ll notice in the event log that the Windows Time Service actually finds your domain controllers as time sources, but then overrides it with the internet server time.windows.com for some reason. The only reference you’ll find pertaining to “time.windows.com”, will be when you run the “w32tm /query /configuration” command.

We need to change the time source from that host to the domain “NT5DS” time source. We’ll do so by resetting the configuration to default settings on the member server.

How to reset the Windows Time Service (w32tm) to default settings

PLEASE NOTE: Only run this on member servers that are experiencing this issue. Do not run this on your domain controller.

  1. Open an elevated (administrative) command prompt
  2. Run the following commands:
    net stop w32time
    w32tm /unregister
    w32tm /register
    net start w32time
  3. Restart the server (may not be needed, but is a good idea)

After doing this, when running “w32tm /query /configuration” you’ll notice the time source will now reflect “NT5DS”, and the servers should now being using your domain hierarchy time sources (domain controllers).

Sep 072018
 
DNS

If you’re experiencing DNS issues (or internet issues) today on September 7 2018, you’re not alone. As of this morning, I’ve been noticing increased traffic coming in to my blog from people searching for DNS issues.

I decided to do a little investigation and noticed numerous people reporting DNS issues in Canada and the United States. While this is being reported by users across North America, I’ve been noticing a trend reporting issues that may be using Canadian hosted DNS Servers.

I will be updating this post below as I find out more information. If you know anything or can contribute any information, please leave a comment below.

Sep 042018
 
Microsoft Windows Logo

Microsoft is ending extended support for Windows 7 on January 14th 2020. With Windows 7 reaching it’s end of life, I highly recommend that you start planning your upgrade from Windows 7 to Windows 10.

When support ends, no more security patches or Windows updates will be available for the product. Expect numerous zero-day exploits to be released shortly after the product reaches EoL.

 

Important Points

  • Test all your applications (line of business applications) compatibility with Windows 10 before deploying
  • Test the OS compatibility on your infrastructure (example, SBS Small Business Server requires modification to support Windows 8 and Windows 10 properly)
  • Compare man-hours and support costs for an upgrade vs the cost of new computers which come pre-installed with Windows 10

More information can be found at https://www.digitallyaccurate.com/blog/2018/09/02/microsoft-windows-7-support-ending-january-2020-windows-7-end-of-life/

Aug 272018
 

So, what happens in a worst-case scenario where your backup system fails, you don’t have any VM snapshots, and the last thing standing in the way of complete data loss is your SAN storage systems LUN snapshots?

Well, first you fire whoever purchased and implemented the backup system, then secondly you need to start restoring the VM (or VMs) from your SAN LUN snapshots.

While I’ve never had to do this in the past (all the disaster recovery solutions I’ve designed and sold have been tested and function), I’ve always been curious what the process is and would be like. Today I decided to try it out and develop a procedure for restoring a VM from SAN Storage LUN snapshot.

For this test I pretended a VM was corrupt on my VMware vSphere cluster and then restored it to a previous state from a LUN snapshot on my HPe MSA 2040 (identical for the HPe MSA 2050, and MSA 2052) Dual Controller SAN.

To accomplish the restore, we’ll need to create a host mapping on the SAN for the LUN snapshot to a new LUN number available to the hosts. We then need to add and mount the VMFS volume (residing on the snapshot) to the host(s) while assigning it a new signature and then vMotion the VM from the snapshot’s VMFS to original datastore.

 

Important Notes (Read first):

  • When mounting a VMFS volume from a SAN snapshot, you MUST RE-SIGNATURE THE SNAPSHOT VMFS volume. Not doing so can cause problems.
  • The snapshot cannot be mapped as read only, VMFS volumes must be marked as writable in order to be mounted on ESXi hosts.
  • You must follow the proper procedure to gracefully dismount and detach the VMFS volume and storage device before removing the snapshot’s host mapping on the SAN.
  • We use Storage vMotion to perform a high-speed move and recovery of the VM. If you’re not licensed for Storage vMotion, you can use the datastore file browser and copy/move from the snapshot VMFS volume to live production VMFS volume, however this may be slower.
  • During this entire process you do not touch, modify, or change any settings on your existing active production LUNs (or LUN numbers).
  • Restoring a VM from a SAN LUN snapshot will restore a crash consistent copy of the VM. The VM when recovered will believe a system crash occurred and power was lost. This is NOT a graceful application consistent backup and restore.
  • Please read your SAN documentation for the procedure to access SAN snapshots, and create host mappings. With the MSA 2040 I can do this live during production, however your SAN may be different and your hosts may need to be powered off and disconnected while SAN configuration changes are made.
  • Pro tip: You can also power on and initialize the VM from the snapshot before initiating the storage vMotion. This will allow you to get production services back online while you’re moving the VM from the snapshot to production VMFS volumes.
  • I’m not responsible if you damage, corrupt, or cause any damage or issues to your environment if you follow these procedures.

We are assuming that you have already either deleted the damaged VM, or removed it from your inventory and renamed the VMs folder on the live VMFS datastore to change the name (example, renaming the folder from “SRV01” to “SRV01.bad”. If you renamed the damaged VM, make sure you have enough space for the new restored VM as well.

Procedure:

Mount the VMFS volume on the LUN snapshot to the ESXi host(s)
  1. Identify the VM you want to recover, write it down.
  2. Identify the datastore that the VM resides on, write it down.
  3. Identify the SAN and identify the LUN number that the VMFS datastore resides on, write it down.
  4. Identify the LUN Snapshot unique name/id/number and write it down, confirm the timestamp to make sure it will contain a valid recovery point.
  5. Log on to the SAN and create a host mapping to present the snapshot (you recorded above) to the hosts using a new and unused LUN number.
  6. Log on to your ESXi host and navigate to configuration, then storage adapters.
  7. Select the iSCSI initator and click the “Rescan Storage Adapters” button to rescan all iSCSI LUNs.

    VMware ESXi Host Rescan Storage Adapter

    VMware ESXi Host Rescan Storage Adapter

  8. Ensure both check boxes are checked and hit “Ok”, wait for the scan to complete (as shown in the “Recent Tasks” window.

    VMware ESXi Host Rescan Storage Adapter Window for VMFS Volume and Devices

    VMware ESXi Host Rescan Storage Adapter Window for VMFS Volume and Devices

  9. Now navigate to the “Datastores” tab under configuration, and click on the “Create a new Datastore” button as shown below.

    VMware ESXi Host Add Datastore Window

    VMware ESXi Host Add Datastore Window

  10. Continue with “VMFS” selected and select continue.
  11. In the next window, you’ll see your existing datastores, as well as your new datastore (from the snapshot). You can leave the “Datastore name” as is since this value will be ignored. In this window you’re going to select the new VMFS datastore from the snapshot. Make sure you confirm this by looking at the LUN number, as well as the value under “SnapshotVolume”. It is critical that you select the snapshot in this window (it should be the new LUN number you added above).
  12. Select next and continue.
  13. On the next window “Mount Option”, you need to change the radio button to and select “Assign a new signature”. This is critical! This will assign a new signature to differentiate it from your existing real production datastore so that the ESXi hosts don’t confuse it.
  14. Continue with the wizard and complete the mount process. At this point ESXi will resignture the VMFS volume and rename it to “snap-OriginalVolumeNameHere”.
  15. You can now browse the VMFS datastore residing on the LUN snapshot and do anything you’d normally be able to do with a normal datastore.
Copy/Move/vMotion the VM from the snapshot VMFS volume to your production VMFS volume

Note: The next steps are only if you are licensed for storage vMotion. If you aren’t you’ll need to use the copy or move function in the file browsing area to copy or move the VMs to your live production VMFS datastores:

  1. Now we’ll go to the vCenter/ESXi host storage area in the web client, and using the “Files” tab, we’ll browse the snapshots VMFS datastore that we just mounted.
  2. Locate the folder for the VM(s) you want to recover, open the folder, right click on the vmx file for the VM and select “Register VM”. Repeat this for any of the VMs you want to recover from the snapshot. Complete the wizard for each VM you register and add it to a host.
  3. Go back to you “Hosts and VMs” view, you’ll now see the VMs are added.
  4. Select and right click on the VM you want to move from the snapshot datastore to your production live datastore, and select “Migrate”.
  5. In the vMotion migrate wizard, select “Change Storage only”.
  6. Continue to the wizard, and storage vMotion the VM from the snapshot VMFS to your production VMFS volume. Wait for the vMotion to complete.
  7. After the storage vMotion is complete, boot the VM and confirm everything is functioning.
Gracefully unmount, detach, and remove the snapshot VMFS from the ESXi host, and then remove the host mapping from the SAN
  1. On each of your ESXi hosts that have access to the SAN, go to the “Datastores” section under the ESXi hosts configuration, right click on the snapshot VMFS datastore, and select “Unmount”. You’ll need to repeat this on each ESXi host that may have automounted the snapshot’s VMFS volume.
  2. On each of your ESXi hosts that have access to the SAN, go to the “Storage Devices” section under the ESXi hosts configuration and identify (by LUN number) the “disk” that is the snapshot LUN. Select and highlight the snapshot LUN disk, select “All Actions” and select “Detach”. Repeat this on each host.
  3. Double check and confirm that the snapshot VMFS datastore (and disk object) have been unmounted and detached from each ESXi host.
  4. You can now log in to your SAN and remove the host mapping for the snapshot-to-LUN. We will not longer present the snapshot LUN to any of the hosts.
  5. Back to the ESXi hosts, navigate to “Storage Adapters”, select the “iSCSI Initiator Adapter”, and click the “Rescan Storage Adapters”. Repeat this for each ESXi host.

    VMware ESXi Host Rescan Storage Adapter

    VMware ESXi Host Rescan Storage Adapter

  6. You’re done!