I have noticed an issue when after upgrading Microsoft Exchange 2016 CU10 to Exchange 2016 CU11, services may fail to start. This issue can be intermittent, where some restarts are able to start more services, and others restarts fewer. I have observed this on 2 separate Exchange upgrades, both were CU10 to CU11.
Recently, a customer had an issue where a Microsoft Exchange security update bricked their entire Exchange CU10 installation. Files were missing and services would not start (even after manually re-configuring all system services to their prior settings, and force starting). To fix this, we weighed our options and decided the best course of action would be to attempt the latest CU (CU11). This is because each Microsoft Exchange Cumulative update is actually a full installer that completely removes the old version, and installs the new version cleanly.
After installing CU11 we were able to rescue the Exchange installation (services could now start, and functioned), however numerous errors and warnings were now present, and we also noticed that there were some new issues with services.
One service in particular called “Net.Tcp Port Sharing Service”, would occasionally not start in time and cause all the Exchange Services not to start (Exchange is dependent on this services). Other times, this service would start, however random Exchange services would timeout.
Some of the errors and warnings included:
Event ID 7000 Source: Service Control Manager Description: The MSComplianceAudit service failed to start due to the following error: The service did not respond to the start or control request in a timely fashion. Event ID 7009 Source: Service Control Manager Description: A timeout was reached (30000 milliseconds) while waiting for the MSComplianceAudit service to connect. Event ID 7000 Source: Service Control Manager Description: The MSExchangeRepl service failed to start due to the following error: The service did not respond to the start or control request in a timely fashion. Event ID 7009 Source: Service Control Manager Description: A timeout was reached (30000 milliseconds) while waiting for the MSExchangeRepl service to connect.
I also observed that on a few restarts, the services that failed would eventually end up restarting 10-15 minutes later (this only occurred 50% of the time).
Originally I was concerned and believed these issues were related to the original issues the customer experienced, however I upgraded my own Exchange 2016 server to CU11 and experienced the same problems (my instance was a clean fully functioning install). I also attempted to upgrade .NET to version 4.7.2 to see if this had any effect, but it did not.
When you go in to services (services.msc) and manually start the services, Exchange functions perfectly and everything works.
As of yet, I don’t have a proper solution. I did however notice that with my customer’s environment, after it was left to sit overnight (around 8 hours), that subsequent restarts actually were able to start the majority of the services properly. It almost seemed as if it just needed time to fix itself. I’m not sure if this is because of IO load, or some type of Exchange database maintenance, but I’m waiting to see if it clears up on my instance as well after an amount of time. I’ll be keeping this post updated.
UPDATE – October 29th: I’ve confirmed for the 2nd time that the issue resolves at least 6-8 hours after the upgrade. At the end of the day I restarted my machine and everything was functioning properly.
If you are experiencing this issue, or can make a comment on it, please leave a comment on this post!