Jul 31, 2019
 

After I upgraded my Synology NAS to DSM 6.2, I started to experience frequent lockups and freezes on my DS1813+. The Synology DS1813+ would become unresponsive, and I wouldn’t be able to SSH in or use the web GUI to access it. In this state, NFS would sometimes become unresponsive as well.

When this occurred, I would need to press and hold the power button to force a shutdown, or pull the power. This is extremely risky, as it can cause data corruption.

I’m currently running DSM 6.2.2-24922 Update 2.

The cause

This went on for over a month before it started to interfere with my ESXi hosts. I noticed that the issue would occur when restarting any of my 3 ESXi hosts, and would reliably occur if I restarted more than one.

During the restarts, while logged in to the web GUI and SSH, I was able to see the memory (RAM) usage skyrocket. Once the swap file had filled up, the kernel would panic and attempt to reclaim memory (keep in mind my DS1813+ has 4GB of memory).

Analyzing “top” as well as the running processes, I noticed the Synology index service (SynoFinder) was causing excessive memory and CPU usage. On a fresh boot of the NAS, it would consume over 500MB of memory.
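For anyone wanting to do the same triage over SSH, here is a rough sketch, assuming DSM’s standard bundled tools (the exact ps options may vary between DSM versions):

# Show overall memory and swap usage, in megabytes
free -m

# List the top memory consumers (%MEM is the 4th column of “ps aux” output)
ps aux | sort -rn -k4 | head -15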

The fix (Please scroll down and see updates)

In my case, I only use my Synology NAS for an NFS/iSCSI datastore for my ESXi environment, and do not use it for SMB (Samba/File Shares), so I don’t need the indexing service.

I went ahead and SSHed into the unit, and ran the following command to turn off the service. Please note, this needs to be run as root (use “sudo su” to elevate from admin to root).

synoservice --disable pkgctl-SynoFinder

While it did work, and the memory was instantly freed, the setting did not stay persistent across reboots. To uninstall the indexing service so the change survives a reboot, run the following command.

synopkg uninstall SynoFinder

Doing this resolved the issue and freed up tons of memory. The unit is now stable.
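To confirm the package is really gone (and to see the memory come back), a quick sanity check from the same SSH session might look like the following; synopkg is DSM’s package manager CLI, though its exact output varies by DSM version:

# Verify the indexing package is no longer installed
synopkg status SynoFinder

# Check how much memory was freed
free -m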

Update May 31st, 2020 – Increased Stability

After further troubleshooting, I noticed that the majority of stability issues started when the ESXi hosts accessing NFS exports on the Synology DiskStation were restarted.

I went ahead and stopped using NFS, started using iSCSI with MPIO, and the stability of the Synology NAS has greatly improved. I will continue to monitor this.
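For anyone doing the same migration, the path selection policy is set per device on the ESXi side; here is a minimal sketch using esxcli, where the naa. identifier is only a placeholder for your own iSCSI LUN:

# List block devices and their current path selection policy (PSP)
esxcli storage nmp device list

# Use round robin across all active paths to the Synology LUN (substitute your own naa. ID)
esxcli storage nmp device set --device naa.6001405xxxxxxxxxxxxxxxxxxxxxxxxx --psp VMW_PSP_RR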

I still have plans to hack the Synology NAS and put my own OS on it.

Update May 2nd, 2020 – It’s still crashing, and really frustrating me

Today I had to restart my 3 ESXi hosts that are connected to the NFS export on the Synology DiskStation. After restarting the hosts, the Synology device went into a lock-up state once again. It appears the issue is still present.

The device is responding to pings, and still provides access to SMB and NFS, but the web GUI, SSH, and console access are unresponsive.

Since this is unacceptable, especially on top of all the issues over the years, I’m officially going to start planning to either retire this device, or attempt hacking the Synology DiskStation to run my own OS.

Update April 21st, 2020 – What I thought was the fix

After a few more serious crashes and lockups, I finally decided to do something about this. I backed up my data, deleted the arrays, and performed a factory reset on the Synology DiskStation. I also zeroed the RAID metadata and the MBR on all the drives.
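For reference, the zeroing can be done from any Linux machine. This is a destructive sketch, with /dev/sdX standing in for each drive in turn, so triple-check the device name before running anything like it:

# DESTRUCTIVE: clear any leftover Linux RAID (md) superblock metadata from the drive
mdadm --zero-superblock /dev/sdX

# DESTRUCTIVE: zero out the MBR / partition table (the first 512-byte sector)
dd if=/dev/zero of=/dev/sdX bs=512 count=1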

I then configured the Synology NAS from scratch, used Btrfs (instead of ext4), and restored the backups.

The NAS now appears to be running great and has not suffered any lockups or crashes since. I’ve also noticed that memory management is working a lot better.

I have a feeling this issue was caused either by the long-term chaining of updates (numerous updates installed over time), or by the use of the ext4 filesystem.

Update March 20th, 2020

As of March 2020, this issue is still occurring across numerous new firmware updates and versions. I’ve tried reaching out to Synology directly on Twitter a few times about this issue, as well as by e-mail (indirectly, regarding something else), and have still not received a response. As of this time, the issue still occurs on a regular basis on DSM 6.2.2-24922 Update 4. I’ve taken production and important workloads off the device, since I can’t have it continuously crashing or freezing overnight.

Update – August 16th, 2019

My Synology NAS has been stable since I applied the fix; however, after an uptime of a few weeks, I noticed that when restarting servers, the memory usage does spike (for example, from 6% to 46%). With the fixes applied above, though, the unit is stable and no longer crashes.

  15 Responses to “Synology Memory Issues and Crashing”

  1. […] memory resources. On my DS1813+ I was having issues with a bug that was causing memory overflows (the post is here), and while dealing with that, I decided to take it a step further and optimize my […]

  2. Having the same issue recently with a 1813+, also a 4GB unit. I have a fair few things running on it, but would have thought it would manage itself so as to not become completely unresponsive. Even ping stopped working for half a day, then started pinging again, but I was not able to log in using the local IP. I will have to force it to shut down, because a soft shutdown doesn’t work; even after 24 hours the NAS still hadn’t shut down the last time it got like this. Thanks for the article, I will apply your suggestion.

  3. Please note that pressing the power button until you hear a beep does not force a shutdown by power cut, but gracefully shuts down the system on Synology devices.

    ref. https://www.synology.com/en-global/knowledgebase/DSM/tutorial/Management/What_can_I_do_unresponsive_Synology_NAS

  4. Hi P4,

    In my experience, when it’s completely unresponsive (kernel panic, or memory overflow), pressing the power button will shut it down (improperly) after 20 seconds or so, without a beep.

    However, yes, in most cases if it is responsive, pressing and holding will trigger a beep, at which point you can release the button and wait for a graceful shutdown.

    Cheers

  5. […] because I only use the NAS for NFS and iSCSI. This resolved the issue. I created a blog post here to outline how to resolve this. I also further optimized the NAS and memory usage by disabling […]

  6. […] Synology Memory Issues and Crashing […]

  7. I have the same problem on a DS213j with the latest firmware version, DSM 6.2.2-24922 Update 4.

  8. […] a result of my Synology DS1813+ crashing yet again due to the Synology Memory issues and Crashing that I’ve been regularly experiencing, I finally decided to try hacking the Synology NAS to […]

  9. I’m having the exact same issue with a DS2314+ over the last couple of months. It never used to happen to me, but I recently upgraded from 8 drives to 11 drives plus an additional 12th drive (SSD read cache), and now I get regular (every 2 weeks?) freezes where I have to hard shut down with the power button because the system has become completely unresponsive. After a reboot it’s fine again for another couple of weeks.

    If it is indeed a memory problem (I have done scans and the memory seems healthy), I think I will try removing my SSD cache to save some memory, as I’m not noticing a drastic performance improvement from it at all. I will remove the SSD cache, monitor for a few weeks, and report back; if that solves my issue, I’m sure it was a memory issue (as the SSD cache requires memory to function).

  10. Fellow Calgarian here. I’m seeing the same thing on my DS1812+ after 8 years of faithful service.
    The LAN light was flashing, and all the drive lights were solid except one.
    I have been running 3GB of memory and no SSD cache the whole time.
    My memory never comes close to being fully utilized.
    Nothing in the logs… the only recent change is adding a seventh drive a month ago, an identical drive.
    Really a puzzle!

    Wondering if it’s related to this kernel panic cause here; I might try their approach if it happens again:

    # Disable TCP segmentation offload on the NIC (ethX is a placeholder for your interface)
    ethtool -K ethX tso off

    # Turn off pause-frame auto-negotiation and RX/TX flow control
    ethtool -A ethX autoneg off rx off tx off

    https://community.spiceworks.com/topic/2055076-synology-unresponsive-until-rebooted

  11. I have a similar problem to the ones described above. I’m using a DS216+ with 1GB of RAM. It started several months ago. Every couple of weeks the box becomes inaccessible, but the status LED is solid green and the disk LEDs are flashing rapidly, either both of them, or just one while the other is solid.
    I can’t find anything in the logs. I’ve checked almost every file in the /var/log folder.
    A total mystery. I don’t run anything other than the stock packages. I’m using the station as an archive and as a DLNA server, nothing more.
    Any suggestions on what I should check or look into?
    thanks a lot!

  12. Hi Vladimir,

    It doesn’t sound like your issue is a memory issue. I think you have a disk that is failing or has failed, but the system is unable to detect it.

    In this case, the system starts to freeze while it waits for read/write retry commands to complete, giving the appearance of a lockup.

    Maybe check the SMART status of the disks, and also check whether one of the disks is lighting/flashing differently than the others. When this happened to me, 7 out of the 8 disks were green, while the failed disk wasn’t lit (busy) due to the intensive retry commands.
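    If smartctl is available on your unit over SSH (it ships with DSM on many models, though that’s an assumption for yours), a quick check of each disk might look like this:

    # Dump SMART health, attributes, and the error log (adjust the device name per disk)
    smartctl -a /dev/sda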

    Cheers
    Stephen

  13. Hi,

    Just to say that for me, it seems the memory module was the problem. I ordered a 4GB module and replaced the stock 1GB one. For more than 10 days it has worked like a charm, with no hanging like before. It had degraded to the point that it was hanging every day. As far as I remember, I never saw the memory usage above 50% with the 1GB module. Now it usually stays at 10-12%.
    So it turned out to be the cheaper fix, and I didn’t have to replace the drives.

    cheers,
    Vladimir

  14. Is restarting still an issue today?
    I’m planning on buying a 1813+ but I’m not really sure about the stability.
