Synology DS1813+ – iSCSI MPIO Performance vs NFS

Recently I decided it was time to beef up my storage link between my demonstration vSphere environment and my storage system. My existing setup included a single HP DL360p Gen8, connected to a Synology DS1813+ via NFS.

I went out and purchased the appropriate (and compatible) HP 4 x 1Gb Server NIC (Broadcom based), and connected all four ports of the Synology device directly to the new server NIC. I then configured an iSCSI Target using a File LUN with ALUA (Advanced LUN features), configured the NICs on both the vSphere and Synology sides, and enabled jumbo frames (9000 byte MTU).
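
On the ESXi side, jumbo frames have to be set on both the vSwitch and each iSCSI VMkernel port. As a rough sketch (the vSwitch and vmk names below are just placeholders for whatever your environment uses):

    # Set a 9000 byte MTU on the vSwitch carrying the iSCSI VMkernel ports
    esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000

    # Set the same MTU on each iSCSI VMkernel interface
    esxcli network ip interface set --interface-name=vmk1 --mtu=9000
    esxcli network ip interface set --interface-name=vmk2 --mtu=9000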

I connected to the iSCSI LUN and created a VMFS volume. I then configured Round Robin MPIO on the vSphere side of things (as always, I made sure to enable “Multiple iSCSI initiators” on the Synology side).
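
For anyone wondering what that involves: with the software iSCSI adapter, each VMkernel port gets bound to the adapter (one per physical uplink) so that every NIC shows up as its own path. Roughly, with placeholder adapter/vmk names:

    # Bind each iSCSI VMkernel port to the software iSCSI adapter (one per physical uplink)
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk1
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk2
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk3
    esxcli iscsi networkportal add --adapter=vmhba33 --nic=vmk4

    # Verify that all paths to the LUN show up and are active
    esxcli storage core path list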

I started to migrate some VMs over to the iSCSI LUN. At first I noticed it was going extremely slow. I confirmed that traffic was being passed across all NICs (and also verified that all paths were active). After the migration completed, I decided to shut down the VMs and restart them to compare boot times. Booting from the iSCSI LUN was absolutely horrible; the VMs took forever to boot up. Keep in mind I’m very familiar with vSphere (my company is a VMware partner), so I know how to properly configure Round Robin, iSCSI, and MPIO.

I then decided to tweak some settings on the ESXi side of things. I configured the Round Robin policy to IOPS=1, which helped a bit. I then changed the RR policy to bytes=8800, which, after numerous other tweaks, I determined achieved the highest performance to the storage system using iSCSI.
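
If you want to try the same tweaks, these are roughly the commands involved (the naa identifier is a placeholder for your LUN's device ID):

    # Make sure the LUN is using the Round Robin path selection policy
    esxcli storage nmp device set --device=naa.xxxxxxxxxxxxxxxx --psp=VMW_PSP_RR

    # Change the Round Robin limit from the default of 1000 IOPS per path to IOPS=1
    esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxxxxxxxxxx --type=iops --iops=1

    # Or switch to a bytes-based limit (8800 bytes, just under the 9000 byte jumbo MTU)
    esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxxxxxxxxxx --type=bytes --bytes=8800

    # Confirm the current setting
    esxcli storage nmp psp roundrobin deviceconfig get --device=naa.xxxxxxxxxxxxxxxx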

This config was used for a couple of weeks, but ultimately I was very unsatisfied with the performance. I know it’s not very accurate, but looking at the Synology resource monitor, each gigabit link over iSCSI was only achieving 10-15MB/sec under high load (single contiguous copies) that should have resulted in 100MB/sec and higher per link. The combined LAN throughput as reported by the Synology device across all 4 gigabit links never exceeded 80MB/sec. File transfers inside of the virtual machines couldn’t get higher than 20MB/sec.

I have a VMware vDP (VMware Data Protection) test VM configured, which includes a performance analyzer inside of the configuration interface. I decided to use this to test some specs (I’m too lazy to configure a real IO/throughput test since I know I won’t be continuing to use iSCSI on the Synology with the horrible performance I’m getting). The performance analyzer tests run for 30-60 minutes, and measure reads and writes in MB/sec, and seeks per second. I tested 3 different datastores.

Synology DS1813+ NFS over 1 x Gigabit link (1500 MTU):

  • Read 81.2MB/sec, Write 79.8MB/sec, 961.6 Seeks/sec

Synology DS1813+ iSCSI over 4 x Gigabit links configured in MPIO Round Robin bytes=8800 (9000 MTU):

  • Read 36.9MB/sec, Write 41.1MB/sec, 399.0 Seeks/sec

Custom-built 8-year-old computer with Linux MD RAID 5, serving NFS over 1 x Gigabit NIC (1500 MTU):

  • Read 94.2MB/sec, Write 97.9MB/sec, 1431.7 Seeks/sec
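
As an aside, if you’d rather not rely on vDP’s analyzer for numbers like these, a quick and dirty sanity check is a couple of dd runs inside a Linux test VM sitting on the datastore (sequential throughput only; the file path is just an example):

    # Sequential write, bypassing the guest page cache (4GB test file)
    dd if=/dev/zero of=/tmp/ddtest bs=1M count=4096 oflag=direct

    # Sequential read of the same file, then clean up
    dd if=/tmp/ddtest of=/dev/null bs=1M iflag=direct
    rm /tmp/ddtest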

Can someone say WTF?!?!?!?! As you can see, it appears there is a major performance hit with the DS1813+ using 4 Gigabit MPIO iSCSI with Round Robin. It’s half the speed of a single link 1 X Gigabit NFS connection. Keep in mind I purchased the extra memory module for my DS1813+ so it has 4GB of memory.

I’m kind of choked I spent the money on the extra server NIC (as it was over $500.00), I’m also surprised that my custom built NFS server from 8 years ago (drives are 4 years old) with 5 drives is performing better then my 8 drive DS1813+. All drives used in both the Synology and Custom built NFS box are Seagate Barracuda 7200RPM drives (Custom box has 5 X 1TB drives configured RAID5, the Synology has 8 x 3TB drives configured in RAID 5).

I won’t be using iSCSI  or iSCSI MPIO again with the DS1813+ and actually plan on retiring it as my main datastore for vSphere. I’ve finally decided to bite the bullet and purchase an HP MSA2024 (Dual Controller with 4 X 10Gb SFP+ ports) to provide storage for my vSphere test/demo environment. I’ll keep the Synology DS1813+ online as an NFS vDP backup datastore.

Feel free to comment and let me know how your experience with the Synology devices using iSCSI MPIO is/was. I’m curious to see if others are experiencing the same results.

UPDATE – June 6th, 2014

The other day, I finally had time to play around and do some testing. I created a new FileIO iSCSI Target, connected it to my vSphere test environment, and configured round robin. While doing some tests on the newly created datastore, the iSCSI connections kept disconnecting. It got to the point where it wasn’t usable.

I scratched that, and tried something else.

I deleted the existing RAID volume, created a new RAID 5 volume, and dedicated it to a Block I/O iSCSI target. I connected it to my vSphere test environment and configured round robin MPIO.

At first all was going smoothly, but then the connection drops started occurring again. Logging in to the DSM, absolutely no errors were being reported and everything looked fine. Yet I was at a point where all connections were down to the ESXi host.
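
For anyone troubleshooting similar drops, the ESXi side of the story shows up in the VMkernel log and in the iSCSI session state, along these lines:

    # iSCSI session drops and path failures are logged in the VMkernel log
    grep -i iscsi /var/log/vmkernel.log | tail -n 50

    # List the current iSCSI sessions to see what's still connected
    esxcli iscsi session list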

I shut down the ESXi host, and then shut down and restarted the DS1813+. I waited for it to come back up, but it wouldn’t. I let it sit there and waited 2 hours for the IP to finally become pingable. I tried to connect to the web interface, however it would only load portions of the page over extended amounts of time (it took 4 hours to load the interface). Once inside, it was EXTREMELY slow. However, it was reporting that all was fine, everything was up, and the disks were fine as well.

I booted the ESXi host and tried to connect, however it couldn’t make the connection to the iSCSI targets. Finally the Synology unit became unresponsive.

Since I only had a few test VMs loaded on the Synology device, I decided to just go ahead and do a factory reset on the unit (I noticed new firmware was available as of that day). I downloaded the firmware, and started the factory reset (which again, took forever since the web interface was crawling along).

After restarting the unit, it was not responsive. I waited a couple hours and again, the web interface finally responded but was extremely slow. It took a couple hours to get through the setup page, and a couple more hours for the unit to boot.

Something was wrong, so I restarted the unit yet again, and again, and again.

This time, the alarm light was illuminated on the unit, and one of the drive lights wouldn’t come on. Again, extreme unresponsiveness. I finally got access to the web interface and it was reporting the temperature of one of the drives as critical, but it said the drive was still functioning and all drives were OK. I shut off the unit, removed the drive, and restarted it again; all of a sudden it was extremely responsive.

I removed the drive, hooked it up to another computer, and confirmed that it had indeed failed.
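
One way to confirm a suspect drive from another Linux machine is with smartmontools (/dev/sdX being whatever device node the drive shows up as):

    # Dump the full SMART attributes and error log for the drive
    smartctl -a /dev/sdX

    # Optionally kick off a short self-test and check the results afterwards
    smartctl -t short /dev/sdX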

I replaced the drive with a new one (same model), and did three tests: one with NFS, one with FileIO iSCSI, and one with BlockIO iSCSI. All of a sudden the unit was working fine, and there were absolutely NO iSCSI connection drops. I tested the iSCSI targets under load for some time, and noticed considerable performance increases with iSCSI, and no connection drops.

Here are some thoughts:

  • Two possible things fixed the connection drops: either the drive was acting up all along, or the new version of DSM fixed the iSCSI connection drops.
  • While performance has increased with FileIO to around ~120-160MB/sec from ~50MB/sec, I’m still not even close to maxing out the 4 X 1Gb interfaces.
  • I also noticed a significant performance increase with NFS, so I’m leaning towards the drive having been acting up since day one (seeks per second increased threefold after replacing the drive and testing NFS). I/O wait has also been significantly reduced.
  • Why did the Synology unit just freeze up once this drive really started dying? It should have been marked as failed instead of causing the entire Synology unit not to function.
  • Why didn’t the drive get marked as failed at all? I regularly performed SMART tests, and checked drive health, there was absolutely no errors. Even when the unit was at a standstill, it still reported the drive as working fine.

Either way, the iSCSI connection drops aren’t occurring anymore, and performance with iSCSI is significantly better. However, I wish I could hit 200MB+/sec.

At this point it is usable for iSCSI using FileIO, however I was disappointed with BlockIO performance (BlockIO should be faster; I have no idea why it isn’t).

For now, I have an NFS datastore configured (using this for vDP backup), although I will be creating another FileIO iSCSI target and will do some more testing.
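
For reference, mounting an NFS export from the Synology as a datastore is a one-liner on the ESXi host (the IP, export path, and datastore name below are placeholders):

    # Mount the Synology NFS export as a datastore on the ESXi host
    esxcli storage nfs add --host=192.168.1.10 --share=/volume1/nfs-datastore --volume-name=Synology-NFS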

Update – August 16, 2019: Please see these additional posts regarding performance and optimization:

Stephen Wagner

Stephen Wagner is President of Digitally Accurate Inc., an IT Consulting, IT Services and IT Solutions company. Stephen is also a VMware vExpert, NVIDIA NGCA Advisor, and HPE Influencer, and specializes in a number of technologies including Virtualization and VDI.

Comments

  • Hey,
    Congrats on your new purchase. A couple of things you may or may not know already. Make sure to upgrade to the latest DSM 5.0 on Synology's support site. They have made huge improvements to the OS for iSCSI (they are saying like 6x) from what I have read. I just bought my DS1813+ a few weeks back as well and it came with a really old version (I think 4.2) of the OS. I immediately upgraded after reading how much better DSM 5.0 handles iSCSI. IMHO, DSM 5 seems very stable and has a number of added features that might be of interest to you. How many drives do you have and what RAID level/config did you go with? I am using RAID-10 with 8x WD RE4 2TB drives and it's fast. I feel I am getting great speeds. I am also using iSCSI MPIO-RR with ESXi 5.5 Update-1. I didn't play with the IOPS setting as you mentioned.

    Here are some results from DSM's resource monitor while running Crystal HD Benchmark v3.0.3 x64

    Volume/iSCSI:
    - Utilization: 58
    - Transfer Rate: 128MB/s
    - Peak Read IOPS: Doesn't Register for some reason
    - Peak Write IOPS: 10204

    Network:
    - Peak Sent: 139MB/s
    - Peak Received: 112MB/s

    If you haven't already, look into creating what Synology calls "Regular Files" iSCSI LUNs and enable the "Advanced LUN Features" option. When you enable this feature, the NAS supports VAAI capabilities. VAAI speeds up many of your storage operations by offloading them to the NAS unit: instead of the hypervisor having to read and rewrite the data back to the NAS, the NAS copies the blocks within itself, so the data never travels the network.

    I love my Synology DS1813+

    Good Luck!

  • I should have probably mentioned I am only using 2 of the 4 1GbE ports on the Synology NAS and 2 1GbE uplinks on ESXi with Jumbo Frames enabled on an isolated VLAN.

    Regards,
    Ash

  • Hi Ash,

    Thanks for your comments. I've actually had this running for over a year. Before trying out the iSCSI feature, I was using NFS which provided the highest amount of throughput in my scenario. When I upgraded to DSM 5 a week ago, I noticed no major increase in speed with the iSCSI Target.

    To answer your questions: As I mentioned in the blog post, I was using the Advanced LUN features options. Also, I mentioned that I'm using 8 X 3TB Seagate 7200 drives in RAID 5.

    Please note that the information provided from the Resource Monitor isn't exactly accurate as this is information provided by the Linux kernel running on the Synology device. Real world speeds are actually slower. And to see more accurate speeds (which still will be inaccurate), it's better to look at the "Disk" instead of the "Volume/iSCSI" inside of the resource manager.

    To properly benchmark, you'll need to configure a Virtual Machine and run benchmarks inside of it against the storage, which will actually measure the iSCSI LUN throughput and IOPS. The measurements I provided in the blog post above reflect this.

    Also, one other thing I wanted to mention in the blog post but forgot: under extremely heavy load some of the paths would actually be lost and go offline to the iSCSI target. At one point 3 of my 4 paths went offline, and I was really nervous I was going to lose access to the LUN.

    Don't get me wrong, I absolutely love the Synology products. I just won't be using them for MPIO iSCSI anytime soon.

    Stephen

  • Stephen,

    Sorry, to be honest, I read your blog early on and didn't read over it again once I went to post. I was only trying to be helpful.

    You can measure performance from the VM level as well like you mentioned, and you can also measure IOPS and network throughput with the ESXTOP utility on the hypervisor.

    I am experiencing the problem you described with connections being lost (completely including Web GUI, SSH) when high utilization occurs. I noticed only one Synology NAS is listed on the compatibility support matrix for ESXi 5.5. I started a ticket with Synology recently to follow up on the problem.

    Regards,
    Ash

  • No worries at all! :)

    Just so you know, in my case, I could still access SMB, SSH, NFS, and the Web GUI. It was only the iSCSI connections to the ESXi hosts that were dropping.

  • Ash - did you get anywhere with Synology with your issues? I see that DSM 5.0-4482 says "Improved the stability of iSCSI target service". Has that worked for you?

  • Hi,

    Just read your post regarding your Link Aggregated performance on a Synology NAS and iSCSI.

    One thing that has become apparent is a mix of link aggregation methods: your ESXi host is set to use a Round Robin policy of sending information, however this method is not supported on a Synology NAS. I have checked on my NAS and can see there is either a Failover option or an LACP option; the latter is the IEEE 802.3ad spec, and uses all the links at the same time.

    Attempting to use the conflicting methods will cause several issues as the Aggregation type doesn't match.

    Additionally (from unfortunate experience), the type of drive in use will cause performance issues. Desktop and NAS grade drives are not recommended for installations over 4 drives as they have no vibration tolerance, and the IOPS are affected by sympathetic vibrations.

    Based on your blog, the first issue I think is your primary concern, as mixing types of Link Aggregation is a BAD idea.

  • Hi Paul,

    Unfortunately there haven't been any updates. Since installing the latest versions of DSM, there's been no change in performance, and the issue of the links going down remains.

  • Hi Tim,

    Actually, the unit does support MPIO round robin; also, I am not using link aggregation. But you are correct that you cannot use both.

    Link Aggregation does not create any performance increase with a single ESX host, since neither iSCSI nor NFS creates multiple connections, and (by design) multiple connections are required to see performance increases with Link Aggregation. This is why link aggregation is seldom used with virtualization for storage purposes, and why iSCSI MPIO using round robin is the choice for most people who virtualize.

    While link aggregation does increase fault tolerance, people still regularly choose iSCSI MPIO for performance reasons. Technically link aggregation would increase performance in a multiple-host environment with storage; however, you could gain even more performance by using MPIO round robin instead (since it creates more connections to each single host).

    Stephen

  • Hi Stephen,

    I spoke to Synology tech support about your issue as I'm keen to find out if the DS1813+ will work for me. They said categorically that round robin isn't supported on their devices. I couldn't see anything in their specs that mentioned it either?

    Cheers,

    Paul.

  • Hi Paul,

    I think you may just be talking to someone who doesn't know. It's definitely supported, and numerous people use it. Synology even has documents set up to explain how to configure it, and they advertise the device as VMware Ready, which is because it supports multiple iSCSI connections and handles them properly.

    Here are two documents they have for MPIO with Windows:
    https://www.synology.com/en-us/support/tutorials/552
    http://forum.synology.com/wiki/index.php/How_to_use_iSCSI_Targets_on_Windows_computers_with_Multipath_I/O

    Here are two documents they have for MPIO with VMware ESXi:
    http://www.synology.com/en-global/support/tutorials/551
    http://forum.synology.com/wiki/index.php/How_to_use_iSCSI_Targets_on_VMware_ESXi_with_Multipath_I/O

    These are all support documents they have on their site, and wiki available for customers to configure the units. It's all supported.

    Hope this helps,

    Stephen

  • Hi Stephen,

    Thanks for that. Yes I've just been reading more on it and agree. What I'm not up to speed on yet is round robin - is that another form of MPIO or just another term for MPIO?

    Thanks.

  • Hi Paul,

    Essentially MPIO is just Multiple Path Input Output. I guess you could say this defines multiple iSCSI connections from the initiator to the target (the target being the iSCSI server). When MPIO is enabled, a host will have multiple connections to the iSCSI target. Keep in mind that this is just a "state", you could call it, of having the multiple connections active.

    Underneath MPIO (as part of MPIO), there are numerous different "configurations". I can't remember the actual names off the top of my head, but there's one for "Most Recently Used" which will use the last available/used path, there's "Fixed" (which is statically configured by the user, I believe), and then there is "Round Robin" which actually alternates between all available paths. It will send a chunk of data on connection 1, then another chunk on connection 2, etc., and circulate; ultimately it provides increased speed, and also provides failover in case one of the connections goes down.

    Now going further into MPIO and Round Robin: by default RR has a pre-configured static amount of IOPS it will send until the number is hit, at which point it jumps to the next connection. This can be changed using the VMware CLI to whatever amount of IOPS you'd like (most people change this value to 1), and you can also change it from IOPS to bytes. Some people do this and set the byte value to the maximum jumbo frame packet size they have configured on the NICs and SAN network.

  • Hi Stephen

    I am a member of the UK Support Team at Synology UK.

    We would like to help with this and see if we can look into what's happening for you. Please contact us directly at uk_support@synology.com and we will be able to help resolve any performance issues.

    Thank you
