Oct 28, 2018
 

I have noticed an issue where, after upgrading Microsoft Exchange 2016 CU10 to Exchange 2016 CU11, services may fail to start. The issue is intermittent: some restarts manage to start more services, and other restarts start fewer. I have observed this on 2 separate Exchange upgrades, both from CU10 to CU11.

The Problem

Recently, a customer had an issue where a Microsoft Exchange security update bricked their entire Exchange CU10 installation. Files were missing and services would not start (even after manually reconfiguring all system services to their prior settings and force-starting them). To fix this, we weighed our options and decided the best course of action would be to install the latest CU (CU11). This is because each Microsoft Exchange Cumulative Update is actually a full installer that completely removes the old version and installs the new version cleanly.

After installing CU11 we were able to rescue the Exchange installation (services could now start and function). However, numerous errors and warnings were now present, and we also noticed some new issues with services.

One service in particular, “Net.Tcp Port Sharing Service”, would occasionally not start in time, which prevented all of the Exchange services from starting (Exchange depends on this service). Other times this service would start, but random Exchange services would time out.

Some of the errors and warnings included:

Event ID 7000
Source: Service Control Manager
Description:
The MSComplianceAudit service failed to start due to the following error: 
The service did not respond to the start or control request in a timely fashion.

Event ID 7009
Source: Service Control Manager
Description:
A timeout was reached (30000 milliseconds) while waiting for the MSComplianceAudit service to connect.

Event ID 7000
Source: Service Control Manager
Description:
The MSExchangeRepl service failed to start due to the following error: 
The service did not respond to the start or control request in a timely fashion.

Event ID 7009
Source: Service Control Manager
Description:
A timeout was reached (30000 milliseconds) while waiting for the MSExchangeRepl service to connect.
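
To see at a glance which services are affected, the event descriptions above can be grouped by service name. This is only a minimal Python sketch, assuming the events have been copied out of Event Viewer as plain text with the same wording as above. The function name `summarize_scm_events` and the `SAMPLE` text are made up for illustration.

```python
import re
from collections import defaultdict

# These patterns follow the wording of the Service Control Manager
# descriptions quoted above (Event ID 7000 and Event ID 7009).
FAILED_RE = re.compile(r"The (.+?) service failed to start due to the following error")
TIMEOUT_RE = re.compile(
    r"A timeout was reached \((\d+) milliseconds\) while waiting for the (.+?) service to connect"
)


def summarize_scm_events(text):
    """Return {service: {"failed": count, "timeout_ms": set of timeouts seen}}."""
    summary = defaultdict(lambda: {"failed": 0, "timeout_ms": set()})
    for match in FAILED_RE.finditer(text):
        summary[match.group(1)]["failed"] += 1
    for match in TIMEOUT_RE.finditer(text):
        summary[match.group(2)]["timeout_ms"].add(int(match.group(1)))
    return dict(summary)


# Hypothetical sample: the descriptions from this post pasted as plain text.
SAMPLE = """\
The MSComplianceAudit service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
A timeout was reached (30000 milliseconds) while waiting for the MSComplianceAudit service to connect.
The MSExchangeRepl service failed to start due to the following error:
The service did not respond to the start or control request in a timely fashion.
A timeout was reached (30000 milliseconds) while waiting for the MSExchangeRepl service to connect.
"""

if __name__ == "__main__":
    for service, info in summarize_scm_events(SAMPLE).items():
        print(service, info["failed"], sorted(info["timeout_ms"]))
```

Running it across several reboots should make it easier to tell whether the same services fail each time or whether the set changes.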

 

I also observed that on a few restarts, the services that failed would eventually start on their own 10-15 minutes later (this happened only about half of the time).

Originally I was concerned that these issues were related to the original problems the customer experienced. However, I upgraded my own Exchange 2016 server to CU11 and experienced the same problems (my instance was a clean, fully functioning install). I also upgraded .NET to version 4.7.2 to see if this had any effect, but it did not.

When you go into Services (services.msc) and manually start the services, Exchange functions perfectly and everything works.
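
If you would rather script the manual start than click through services.msc, something along these lines could work. It is only a sketch under a few assumptions: it uses the built-in `sc.exe start` command, it assumes `NetTcpPortSharing` is the short service name behind “Net.Tcp Port Sharing Service”, and the service list only contains the names seen in the events above (a real server has many more Exchange services). The helper names are hypothetical, and it defaults to a dry run that only prints the commands.

```python
import platform
import subprocess

# Start the dependency first. "NetTcpPortSharing" is an assumed short name
# for the "Net.Tcp Port Sharing Service"; the other two come from the events
# quoted in this post. Extend the list for your own server.
SERVICES = ["NetTcpPortSharing", "MSExchangeRepl", "MSComplianceAudit"]


def build_start_commands(services):
    """Return one `sc.exe start <name>` argument list per service, in order."""
    return [["sc.exe", "start", name] for name in services]


def start_services(services, dry_run=True):
    """Print each command; execute it only on Windows when dry_run is False.

    Returns a list of (service, return code) pairs for commands that ran.
    """
    results = []
    for command in build_start_commands(services):
        print(" ".join(command))
        if not dry_run and platform.system() == "Windows":
            completed = subprocess.run(command, capture_output=True, text=True)
            results.append((command[2], completed.returncode))
    return results


if __name__ == "__main__":
    # Dry run by default: nothing is started, the commands are only printed.
    start_services(SERVICES)
```

To actually start the services, call `start_services(SERVICES, dry_run=False)` from an elevated prompt on the Exchange server. A non-zero return code means `sc.exe` reported an error for that service.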

The Solution

As of yet, I don’t have a proper solution. I did however notice that in my customer’s environment, after it was left to sit overnight (around 8 hours), subsequent restarts were able to start the majority of the services properly. It almost seemed as if it just needed time to fix itself. I’m not sure if this is because of IO load or some type of Exchange database maintenance, but I’m waiting to see if it also clears up on my instance after some time. I’ll be keeping this post updated.

UPDATE – October 29th: I’ve confirmed for the 2nd time that the issue resolves at least 6-8 hours after the upgrade. At the end of the day I restarted my machine and everything was functioning properly.

 

If you are experiencing this issue, or have any insight into it, please leave a comment on this post!

  14 Responses to “Exchange 2016 CU11 – Services Fail to Start”

  1. Interesting as I’m just about to apply CU11 to a CU10 environment, (or was) thanks for the warning, 🙂

  2. Glad if the warning helped!

    I wouldn’t say to wait, I’d just recommend doing it on a Friday. After the install, restart the server. Let it sit for 20-30 minutes, do another restart.

    After this, give it 10 minutes to boot, and look at the services. You may need to manually start the Exchange services. Everything should work when you do this.

    After another 24 hours, try restarting the server again, and everything should be working fine!

    I haven’t had any issues with CU11 other than this one (which resolved itself), since I deployed it at the original time of the post.

    Cheers,
    Stephen

  3. Stephen thanks, I have successfully run the setup.exe /PrepareSchema

    But running setup.exe /PrepareAD consistently fails

    It gets to about 44% through the process and bombs out.

  4. Hi David,

    That’s odd. How did you kick off the install?

    I’d recommend opening an elevated administrative command prompt, navigate to the DVD, and then run setup.exe.

    The install will also try to run PrepareAD and PrepareSchema, it might give you more information. Also, what was the error when it failed PrepareAD?

    Finally, what is your Active Directory domain minimum version set to for the forest and domain? I’m wondering if this may be why it’s failing, but I’d need to see more on the error.

    Cheers

  5. I used an elevated command prompt, changed to the drive that setup.exe was situated on, and then ran setup.exe

    I notice in the Event Viewer (MSExchange Management) there are references to “Cmdlet failed”, cmdlet “Install-ExchangeOrganization”, Event ID 6, task category General, level Error.

    The Domain minimum version and forest are both what they should be.

  6. Hi David,

    Can you check the event log for any AD communication errors? Also, how many domain controllers do you have? Can you check their event log to see if they are healthy?

    Maybe do a health check with dcdiag and see if anything odd comes up. Also, try restarting your domain controllers, member servers, and then try the install again?

    I’d be more worried something is either wrong with DNS, AD, the network communication, or something else.

    Stephen

  7. I was able to upgrade 10 Exchange 2016 servers from CU7 to CU11 without any issues. Now I am building 5 brand new ones in another data center because we are moving, and I ran into this issue… I built them with CU11 and services just keep failing… reboots and everything else don’t fix it. I wait overnight and the next day they are all running fine. Further reboots do not reintroduce the issue… very weird.

  8. Hi Chris,

    It is bizarre, usually experiencing this I’d just blast away the server and restart the install process… I have absolutely no idea what happens to self correct over a period of time…

    Most people wouldn’t even wait, lol.

    Cheers
    Stephen

  9. I found the root cause Stephen! Since the issue did not go away and continued after additional reboots, we found a bug with HP hardware, etc that is resolved by adding the following registry key on the affected servers:

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\TimeZoneInformation]
    "RealTimeIsUniversal"=dword:00000001

    I looked into this direction because I saw a lot of events after reboot had the wrong time stamp (few hours into the future). I was validating time settings over and over which were correct. This is the only thing that resolved the issue.

  10. Hi Chris,

    That’s interesting and a great find! I use HPE servers, but I can also confirm that this issue occurred in a Dell environment with ESXi.

    Thanks again,
    Stephen

  11. Would be good to apply to a Dell server and see if it also helps. We don’t use anything other than HPE, therefore I cannot confirm.

  12. I had a variant of this issue. My ManageEngine Desktop Server application tried to reapply the CU11 update, which hosed my install and set all of the Exchange services to Disabled. My server is Windows 2012 R2 Standard running in a VMware 6.0 environment (SimpliVity storage device).

  13. Hi Chris,

    In events like yours, if the install gets completely hosed, you can try to rescue it by trying to install an even newer CU version (if you can resolve the root of the problem).

    I’ve rescued many Exchange upgrades that were busted (services disabled, nothing works), by just using an even newer CU than the one that bricked the install.

    Just keep in mind that this only works while a newer CU exists; eventually you run out of newer ones to try.

    Cheers,
    Stephen

  14. Just to add fuel to the fire. Getting the same error still.

    Exchange 2016, CU 12
    .NET 4.7.2

    Going to just manually start everything for a few days and see if it will go away.
