Once I upgraded my Synology NAS to DSM 6.2, I started to experience frequent lockups and freezes on my DS1813+. The Synology DS1813+ would become unresponsive, and I wouldn't be able to SSH in or use the web GUI to access it. In this state, NFS would sometimes become unresponsive as well.
When this occurred, I would need to press and hold the power button to force a shutdown, or pull the power. This is extremely risky, as it can cause data corruption.
I’m currently running DSM 6.2.2-24922 Update 2.
This occurred for over a month until it started to interfere with ESXi hosts. I also noticed that the issue would occur when restarting any of my 3 ESXi hosts, and would definitely occur if I restarted more than one.
During the restarts, while logged in to the web GUI and SSH, I was able to see the memory (RAM) usage skyrocket. Once the swap file had filled up, the kernel would panic and attempt to reduce memory usage (keep in mind my DS1813+ has 4GB of memory).
Analyzing "top" as well as looking at the process list, I noticed the Synology indexing service was causing excessive memory and CPU usage. On a fresh boot of the NAS, it would consume over 500MB of memory.
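For anyone trying to reproduce this diagnosis over SSH, a quick way to spot memory-hungry processes is to sort the process list by the %MEM column. A minimal sketch, assuming a procps-style `ps aux` (on DSM, the busybox `ps` may not support every flag, which is why this pipes through `sort` instead of relying on `ps --sort`):

```shell
# Show the five processes using the most memory.
# Column 4 of "ps aux" output is %MEM; sort descending on it.
ps aux | sort -rn -k4 | head -n 5
```

If the indexing service is the culprit, its processes will sit near the top of this list.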
In my case, I only use my Synology NAS for an NFS/iSCSI datastore for my ESXi environment, and do not use it for SMB (Samba/File Shares), so I don’t need the indexing service.
I went ahead and SSHed into the unit, and ran the following command to turn off the service. Please note, this needs to be run as root (use "sudo su" to elevate from admin to root).
synoservice --disable pkgctl-SynoFinder
While it did work, and the memory was instantly freed, the setting did not stay persistent across reboots. To uninstall the indexing service, run the following command.
synopkg uninstall SynoFinder
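If you would rather keep the package installed and only keep it disabled, one workaround (not something I pulled from Synology's documentation, just a sketch) is a boot-up task in Task Scheduler that re-runs the disable command on every start. A minimal script for such a task, guarded so it safely no-ops on a non-DSM system:

```shell
#!/bin/sh
# Boot-time task: re-disable the SynoFinder indexing service on DSM 6.x.
# Guarded so the script is a no-op where synoservice is unavailable.
if command -v synoservice >/dev/null 2>&1; then
    synoservice --disable pkgctl-SynoFinder
else
    echo "synoservice not found; skipping (not a DSM system)"
fi
```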
Doing this resolved the issue and freed up tons of memory. The unit is now stable.
After troubleshooting I noticed that the majority of stability issues would start occurring when ESXi hosts accessing NFS exports on the Synology diskstation are restarted.
I went ahead and stopped using NFS, started using iSCSI with MPIO, and the stability of the Synology NAS has greatly improved. I will continue to monitor this.
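For context on the ESXi side (not covered in detail above): switching a datastore to iSCSI with MPIO typically involves setting the Round Robin path selection policy on the device so I/O rotates across paths. A hedged sketch of the relevant `esxcli` calls, run on each host; the `naa.*` device identifier below is a placeholder you would substitute with your own:

```shell
# List iSCSI devices and their current path selection policy (PSP).
esxcli storage nmp device list

# Set Round Robin on a specific device so I/O is spread across paths.
# "naa.XXXX" is a placeholder; use your device's actual identifier.
esxcli storage nmp device set --device naa.XXXX --psp VMW_PSP_RR
```

These commands only run on an ESXi host, so treat this as a configuration fragment rather than something to copy verbatim.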
I still have plans to hack the Synology NAS and put my own OS on it.
Today I had to restart my 3 ESXi hosts that are connected to the NFS export on the Synology Disk Station. After restarting the hosts, the Synology device went into a lock-up state once again. It appears the issue is still present.
The device is responding to pings, and still provides access to SMB and NFS, but the web GUI, SSH, and console access is unresponsive.
I'm officially going to start planning on retiring this device, as this is unacceptable, especially on top of all the issues over the years. Alternatively, I may attempt to hack the Synology DiskStation to run my own OS.
After a few more serious crashes and lockups, I finally decided to do something about this. I backed up my data, deleted the arrays, and performed a factory reset on the Synology Disk Station. I also zeroed the metadata and MBR off all the drives.
I then configured the Synology NAS from scratch, used Btrfs (instead of ext4), and restored the backups.
The NAS now appears to be running great and has not suffered any lockups or crashes since. I've also noticed that memory management is working a lot better.
I have a feeling this issue was caused either by the long-term chaining of updates (numerous updates installed over time) or by the use of the ext4 filesystem.
As of March 2020, this issue is still occurring on numerous new firmware updates and versions. I've tried reaching out to Synology directly on Twitter a few times about this issue, as well as by e-mail (indirectly, regarding something else), and have still not received a response. As of this time, the issue still occurs on a regular basis on DSM 6.2.2-24922 Update 4. I've taken production and important workloads off the device, since I can't have it continuously crashing or freezing overnight.
My Synology NAS has been stable since I applied the fix. After an uptime of a few weeks, I noticed that when restarting servers, the memory usage does still spike (for example, from 6% to 46%); however, with the fixes applied above, the unit is stable and no longer crashes.
Comments
Having the same issue recently with my 1813+, also a 4GB unit. I have a fair few things running on it, but would have thought it would manage itself so as to not become completely unresponsive. Even ping stopped working for half a day, then started pinging again, but I was not able to log in using the local IP. I will force it to shut down, because a soft shutdown doesn't work; even after 24 hours the NAS still didn't shut down the last time it got like this. Thanks for the article, I will apply your suggestion.
Please note that pressing the power button until you hear a beep does not force a shutdown by power cut, but gracefully shuts down the system on Synology devices.
ref. https://www.synology.com/en-global/knowledgebase/DSM/tutorial/Management/What_can_I_do_unresponsive_Synology_NAS
Hi P4,
In my experience, when it's completely unresponsive (kernel panic, or memory overflow), pressing the power button without a beep will shut it down (improperly) after 20 seconds or so.
However, yes in most cases if it is responsive, pressing and holding will initiate a beep, where you can then release the button and wait for a graceful shutdown.
Cheers
I have the same problem on ds213j and latest firmware version. DSM 6.2.2-24922 Update 4.
I'm having the exact same issue with a DS2314+ the last couple of months. It never used to happen to me, but I recently upgraded from 8 drives to 11 drives plus an additional 12th drive (SSD read cache), and now I get regular (every 2 weeks?) freezes where I have to hard shut down with the power button because the system has become completely unresponsive. After a reboot it's fine again for another couple of weeks.
If it is indeed a memory problem (I have done scans and the memory seems healthy), I think I will try removing my SSD cache to save some memory, as I'm not noticing a drastic performance improvement at all. I will remove the SSD cache, monitor for a few weeks, and report back; if that solves my issue, I'm sure it was a memory issue (as the SSD cache requires memory to function).
Fellow Calgarian here.. I'm seeing the same thing on my DS1812+ after 8 years of faithful service.
The LAN light was flashing and all the drive lights were solid except one.
I have been running 3Gb of memory and no SSD cache the whole time.
My memory never comes close to being fully utilized.
Nothing in the logs... the only recent change is adding a seventh drive a month ago, an identical drive.
Really a puzzle!
Wondering if it's related to this kernel panic cause here; I might try their approach if it happens again:
ethtool -K ethX tso off
ethtool -A ethX autoneg off rx off tx off
https://community.spiceworks.com/topic/2055076-synology-unresponsive-until-rebooted
I have a similar problem to the ones described above. I'm using a DS216+ with 1GB RAM. It started several months ago. Every couple of weeks the box becomes inaccessible, but the status LED is solid green and the disk LEDs are flashing rapidly (either both, or just one while the other is solid).
I don't find anything in the logs. Checked almost every file in the /var/log folder.
Total mystery. I don't run anything other than the stock packages. I'm using the station as an archive and as a DLNA server, nothing more.
Any suggestions what I should check or look into?
thanks a lot!
Hi Vladimir,
It doesn't sound like your issue is a memory issue. I think you have a disk that is failing or has failed, but the system is unable to detect it.
In this case, the system starts to freeze while it waits for read/write retry commands to complete, giving the appearance of a lockup.
Maybe check the SMART status of the disks, and also check whether one of the disks is lit or flashing differently than the others. When this happened to me, 7 of the 8 disks were green, while the failed disk wasn't lit (busy) due to the intensive retry commands.
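If you can still get a shell, `smartctl` (from smartmontools) is the usual tool for checking SMART health. A minimal sketch, guarded so it no-ops where smartctl isn't installed; the `/dev/sd[a-h]` glob is a placeholder range you would adjust for your unit's drive count:

```shell
#!/bin/sh
# Check SMART overall health for each disk (requires root).
# /dev/sd[a-h] is a placeholder; adjust for your drive bays.
if command -v smartctl >/dev/null 2>&1; then
    for disk in /dev/sd[a-h]; do
        [ -e "$disk" ] || continue
        echo "=== $disk ==="
        smartctl -H "$disk" || true   # prints overall health: PASSED/FAILED
    done
else
    echo "smartctl not found; install smartmontools first"
fi
```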
Cheers
Stephen
Hi,
just to say that, for me, it seems the memory module was the problem. I ordered a 4GB module and replaced the stock 1GB one. For more than 10 days it has worked like a charm, no hanging like before. It had degraded to the point that it was hanging every day. As far as I remember, I never saw the memory usage above 50% with the 1GB module. Now it usually stays at 10-12%.
So it turned out to be the cheaper fix, and I didn't have to replace the drives.
cheers,
Vladimir
Is restarting still an issue today?
I'm planning on buying a 1813+ but I'm not really sure about the stability.