
Sophos UTM – 9.410-6 services crash and /tmp volume full after firmware upgrade

Had a nasty little surprise with one of my clients this afternoon. Two days ago I updated their Sophos UTM (UTM220) to version 9.410-6 without any issues.

However, today I started to receive notifications that services were crashing (specifically the ACC device agent).

After receiving a few of these, I logged in to check it out. At first glance there were no visible errors on the UTM itself, but after some further digging, I noticed these events in the “System Messages” log file:

2017:02:06-17:09:32 mail partitioncleaner[7918]: automatic cleaning for partition /tmp started (inodes: 0/100 blocks: 100/85)

2017:02:06-17:09:32 mail partitioncleaner[7918]: stopping deletion: can’t delete more files
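If you have shell access, the same events can be pulled out of the log with grep rather than through the log viewer. The following is only a minimal sketch: the function name is made up for illustration, and the default path /var/log/system.log is an assumption about where the “System Messages” log is stored on your unit, so pass the real path if it differs.

```shell
# Sketch: print every partitioncleaner event found in a log file.
# ASSUMPTION: /var/log/system.log is where "System Messages" is stored;
# pass another path as the first argument if that is not the case.
partitioncleaner_events() {
    log_file="${1:-/var/log/system.log}"
    # -F matches the literal string, so "[" is not treated as a pattern.
    grep -F 'partitioncleaner[' "$log_file"
}

# Example usage (show the 20 most recent events):
#   partitioncleaner_events /var/log/system.log | tail -n 20
```

A steady stream of “stopping deletion” entries would suggest the cleaner cannot free enough space on its own.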

Looks like a potential storage problem? Yes, it was, but it turned out to be slightly more complicated.

I enabled SSH on the UTM and issued the “df” command (which shows volume usage), and found that the /tmp volume was 100% full.
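For anyone who wants to check this quickly or script it, here is a rough sketch of reading the usage figure from “df”. The function name is hypothetical, and it assumes a “df” that supports the POSIX -P flag, which keeps each filesystem on a single line so the percentage is always in the fifth column.

```shell
# Sketch: read `df -P` output on stdin and print the usage percentage
# (without the % sign) of the first filesystem listed.
parse_usage_percent() {
    # Line 1 is the header; line 2 is the filesystem we asked about.
    awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Example usage:
#   df -P /tmp | parse_usage_percent
# Inode usage can be checked separately with:
#   df -i /tmp
```

On the affected unit the first command would have printed 100.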

Running “ls” and “ls -hl”, I found 25+ files of around 235MB each, named “AV-malware-names-XXXX-XXXXXX”.
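To keep an eye on these files without scanning the “ls” output by hand, something along these lines could be used. It is a sketch only: both function names are invented for illustration, and it assumes a “find” that supports -maxdepth (GNU and BSD versions do).

```shell
# Sketch: count files named AV-malware-names-* directly inside a
# directory (defaults to /tmp).
count_av_files() {
    find "${1:-/tmp}" -maxdepth 1 -type f -name 'AV-malware-names-*' \
        | wc -l | tr -d ' '
}

# Sketch: total disk usage of those files, in KiB.
av_files_kib() {
    find "${1:-/tmp}" -maxdepth 1 -type f -name 'AV-malware-names-*' \
        -exec du -k {} + | awk '{ total += $1 } END { print total + 0 }'
}

# Example usage:
#   echo "$(count_av_files /tmp) files, $(av_files_kib /tmp) KiB"
```

Running it a few times over several minutes shows whether the number of files is still climbing.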

Restarting the unit clears those files; however, they come back shortly afterwards (I noticed a new one being added every 5-10 minutes).

After some further digging (still haven’t heard back from Sophos on the support case), I came across some other users experiencing the same issues. While no one found a permanent resolution, they did mention this had to do with the Avira AV engine or possibly the dual scan engine.

Checking the UTM, I noticed that we had E-Mail scanning configured for dual scan.

Solution (temporary workaround):

I went ahead and configured the E-Mail scanner (the only scanner I had that was using dual scan) to use single scan only. I then restarted the UTM. In my environment the default engine for single scan is “Sophos”.

I am now sitting here with 30 minutes of uptime and absolutely no “AV-malware-names-XXXX-XXXXXX” files created.

I will post an update when I hear back from Sophos support.

Hope this helps someone else!

 

Update (after original post):

I heard back from Sophos support: this is a known bug in 9.410. The current official workaround is to change to single scan and use the Avira engine instead of the Sophos engine.

Update #2:

I received a notification this morning that a new firmware update is available (Version: 9.411003 – Maintenance Release). I haven’t installed it yet, but judging from the Bugfixes notes, it appears to have been released to fix this issue:

Fix [NUTM-6804]: [AWS] Update breaks HVM standalone installations
Fix [NUTM-6747]: [Email] SAVI scanner coredumps permanently in MailProxy after update to 9.410
Fix [NUTM-6802]: [Web] New coredumps from httpproxy after update to v9.410

Update #3:

I noticed that this bug was interrupting some mail flow on my own Sophos UTM, as well as on some of my clients’ units. As an emergency measure, I went ahead and installed 9.411-3.

Things were fine for around 10 hours, until I started to receive notifications of the HTTP proxy failing and requiring a restart. When I logged in to the UTM, it was very sluggish, and sometimes completely unresponsive for around 10 minutes. Web browsing was not functioning at all on the internal network behind the UTM.

This issue still hasn’t been resolved. Hopefully we see a stable working fix sometime soon.

Stephen Wagner

Stephen Wagner is President of Digitally Accurate Inc., an IT Consulting, IT Services and IT Solutions company. Stephen Wagner is also a VMware vExpert, NVIDIA NGCA Advisor, and HPE Influencer, and also specializes in a number of technologies including Virtualization and VDI.

Comments:

  • Did you have any issues with http proxy restarting with this firmware version?

  • Hi Mauricio,

    I personally did not see it, but I read on the Sophos community boards that a couple of other users did see the http proxy restarting.

    Cheers,
    Stephen

  • Updated the post to reflect the bugfix IDs.

    Mauricio, it appears the update released today should fix the issue you were reporting with the http proxy restarting.

  • Something changed with this update and not for the better. Proxied traffic is definitely sluggish overall and CPU has gone from an average of 5% to 30% since the update was applied on my firewall.

  • Hi Chris,

    Please keep us posted. I'm thinking there's a 50% chance it may just be downloading and installing AV definitions for all the scanning systems. Let me know if this lasts longer than 30 minutes.

    Cheers

  • Hi, thanks for your post. I want to confirm this problem and share my experience.
    Initially I made the usual upgrade to version 9.410-6 and had the problem with the data partition filling up quickly. I had an old system (by which I mean it had been through many live upgrades) and decided it was time to reinstall and allocate more space. So I downloaded the latest version, did a clean install and restored the settings. The data partition filling was gone (it grows to a limit and then drops back) but the http(s) proxy server restarts continued. Since then I have applied each new update but the problem persists. No luck with the single AV engine setting. No luck with disabling https AV scanning. I noticed restarts during the night when nobody is working. Also, with the new install I have no unusual CPU usage or anything else out of the ordinary.
    Still waiting for a new patch to resolve the http(s) proxy restarts.

  • Hi chrysosotmos,

    As a temp fix, check to make sure you're not using dual scan on any of the other systems (such as the SMTP proxy, etc.). I noticed the UTM behaves when I set every service on the UTM to single scan. So far I'm at 4 days of uptime on 5 units that were experiencing the issue, and they have been behaving since I made the change.

    Cheers

  • Hi again,

    I did what you said, but no luck. I even disabled http(s) scanning everywhere and still have proxy restarts. Anyway the dashboard says the antivirus is still active for http/s protocols although I don't know where exactly I left something enabled.

    Any news about this issue? Anything about a new patch?

  • Hi chrysosotmos,

    I've heard of no changes or any updates to the issue. I'm still running with the modifications I made to make things work.

    Just curious, have you tried changing the default AV engine (from Sophos to Avira, or vice versa)? I'd recommend restarting after making the change.

    Have you logged in to your UTM with SSH and checked the logs to find out what's crashing and why? Do you know if it's being caused by memory usage on the HTTP/HTTPS proxy?

    Cheers,
    Stephen
