Had a nasty little surprise with one of my clients this afternoon. Two days ago I updated their Sophos UTM (UTM220) to version 9.410-6 without any issues.
However, today I started to receive notifications that services were crashing (specifically ACC device agent).
After receiving a few of these, I logged in to check it out. Immediately there was no visible errors on the UTM itself, but after some further digging, I noticed these event logs in the “System Messages” log file:
2017:02:06-17:09:32 mail partitioncleaner[7918]: automatic cleaning for partition /tmp started (inodes: 0/100 blocks: 100/85)
2017:02:06-17:09:32 mail partitioncleaner[7918]: stopping deletion: can’t delete more files
Looks like a potential storage problem? Yes it was, but slightly more complicated.
I enabled SSH on the UTM and issued the “df” command (show’s volume usage), and found that the /tmp volume was 100% full.
Doing a “ls” and “ls -hl”, I found there were 25+ files that were around 235MB in size called: “AV-malware-names-XXXX-XXXXXX”.
Restarting the unit clears those files, however they come back shortly (I noticed it would add one every 5-10 minutes).
After some further digging (still haven’t heard back from Sophos on the support case), I came across some other users experiencing the same issues. While no one found a permanent resolution, they did mention this had to do with the Avira AV engine or possibly the dual scan engine.
Checking the UTM, I noticed that we had the E-Mail scanning configured for dual scan.
Solution (temporary workaround):
I went ahead and configured the E-Mail scanner (the only scanner I had that was using dual scan) to use single scan only. I then restarted the UTM. In my environment the default setting for single scanning is set to “Sophos”.
I am now sitting here with 30 minutes of uptime and absolutely no “AV-malware-names-XXXX-XXXXXX” files created.
I will post an update when I hear back from Sophos support.
Hope this helps someone else!
Update (after original post):
I heard back from Sophos support, this is a known bug in 9.410. The current official workaround is to change to single scan and use the AVIRA engine instead of the Sophos engine.
Update #2:
Received notification this morning of a new firmware update available (Version: 9.411003 – Maintenance Release). While I haven’t installed it, it appears from the Bugfixes notes that it was released to fix this issue:
Fix [NUTM-6804]: [AWS] Update breaks HVM standalone installations
Fix [NUTM-6747]: [Email] SAVI scanner coredumps permanently in MailProxy after update to 9.410
Fix [NUTM-6802]: [Web] New coredumps from httpproxy after update to v9.410
Update #3:
I noticed that this bug was interrupting some mailflow on my Sophos UTM, as well as some of my clients. I went ahead and as an emergency situation, installed 9.411-3.
Things were fine for around 10 hours until I started to receive notification of the HTTP proxy failing and requiring restart. Logging in to the UTM, it was very unresponsive, sometimes completely unresponsive for around 10 minutes. Web browsing was not functioning at all on the internal network behind the UTM.
This issue still hasn’t been resolved. Hopefully we see a stable working fix sometime soon.