Feb 14, 2017
 

Years ago, HPE released the GL200 firmware for their HPE MSA 2040 SAN, which allowed users to provision and use virtual disk groups (and virtual volumes). The firmware brought a whole bunch of features, such as Read Cache, performance tiering, thin provisioning of virtual disk group based volumes, and the ability to allocate and commission new virtual disk groups as required.

(Please Note: With virtual disk groups, you cannot add a single disk to an already created disk group; you must either create another disk group (best practice is to create it with the same number of disks, same RAID type, and same disk type), or migrate the data, then delete and re-create the disk group.)

The biggest thing with virtual storage was that volumes created on virtual disk groups could span multiple disk groups, providing access to different types of data over disks with different performance capabilities. Essentially, via an automated process internal to the MSA 2040, the SAN places highly used data (hot data) on faster media such as SSD based disk groups, and regularly/seldom used data (cold data) on slower media such as Enterprise SAS disks or archival MDL SAS disks.

(Please Note: Using the performance tier either requires the purchase of a performance tiering license, or is bundled if you purchase an HPE MSA 2042, which additionally comes with SSD drives for use as “Read Cache” or the “Performance tier”.)

When the firmware was first released, I had no real urge to try it out since I have 24 x 900GB SAS disks (only one type of storage), and of course everything was running great, so why change it? With that being said, I’ve wanted and planned to one day kill off my linear disk groups and implement virtual disk groups. The key reasons for me are thin provisioning (the MSA 2040 supports the “DELETE” VAAI primitive) and virtual based snapshots (in my environment, I require over-commitment of the volume). As a side note, as of ESXi 6.5, ESXi now regularly unmaps unused blocks when using the VMFS-6 filesystem (if automatic space reclamation is left enabled), which is great for thin provisioned SANs that support the “DELETE” VAAI primitive.
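
If you’re on ESXi 6.5 with VMFS-6 and want to confirm that automatic space reclamation is actually enabled on a datastore, a quick check from the ESXi shell looks roughly like the following (the datastore name “Datastore1” is just an example placeholder):

    # Show the current UNMAP/reclaim settings for a VMFS-6 datastore
    esxcli storage vmfs reclaim config get -l Datastore1

    # Set the reclaim priority to "low" (the default for new VMFS-6 datastores),
    # or "none" to disable automatic unmapping entirely
    esxcli storage vmfs reclaim config set -l Datastore1 --reclaim-priority=low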

My environment consisted of 2 linear disk groups: 12 disks in RAID 5 owned by controller A, and 12 disks in RAID 5 owned by controller B (24 disks total). Two weekends ago, I went ahead and migrated all my VMs to the other datastore (on the other volume), deleted the first linear disk group, and created a virtual disk group in its place. I then migrated all the VMs back, deleted the second linear disk group, and created the second virtual disk group.

Overall the process was very easy and fast. No downtime is required for this operation if you’re licensed for Storage vMotion in your vSphere environment.

During testing, I noticed absolutely no performance loss using virtual vs linear, except for some functions that use the VAAI primitives, which of course run faster on the virtual disk groups since the work is offloaded to the SAN. This was a major concern for me, as linear storage is accessed more directly than virtual disk groups, which add an extra layer of software between the controllers and the disks (a direct block mapping vs the virtualized page mapping behind the iSCSI targets presented by the controllers).

Unfortunately since I have no SSDs and no extra room for disks, I won’t be able to try the performance tiering, but I’m looking forward to it in the future.

I highly recommend implementing virtual disk groups on your HPE MSA 2040 SAN!

  27 Responses to “HPE MSA 2040 – The switch from linear disk groups, to virtual disk groups…”

  1. Hi. I’ve a new MSA2040 with a StorageWorks D2700 connected to it via 2x DACs. I’ve then got the 2040 connected to 2x HPE 8/8 Brocades in a FC setup and some ESXi hosts connected to the switches via HBAs. I’m experimenting with how best to use the storage trays. I only have 30x 600GB SAS 10k disks between the two, so I played with 24x in the 2040 and the rest in the 2700, and then 15x in the 2040 and 15x in the 2700, but I can’t decide which makes the most sense. My final experiment will be to go back to 24x in the 2040 and 6x in the 2700 and set up 2x virtual disks, but with the first VD using odd numbered drives starting in the 2040 and ending in the 2700, and the second VD using the even drives. I thought this would provide the best use of the drives while also spreading the spindles across the 2040 and the 2700. What would your opinion be on this?

  2. Hi “New MSA2040 D2700 owner”,

    As for the design of your implementation, you need to also take in to account the type of workloads you will be putting on the MSA and the various disk groups you create.

    First, I wouldn’t recommend alternating disk placement in the disk groups (your comment about even/odd numbering). I would have the disks physically grouped together so they can be easily identified. While the MSA may allow you to do it your way, I’d highly recommend against it. The last thing you want is someone accidentally pulling the wrong disk in the event of a failure when restoring the ordering of a disk group (please reference the MSA 2040 documentation on restoring order after a disk failure replacement).

    The SAN will have the fastest access to the disks directly inside of the MSA 2040, so I would reserve those for disk groups serving high I/O and high bandwidth applications. If you plan on adding any SSD disks in the future, I would reserve the disk slots in the MSA chassis for those SSDs.

    There is a limit to how many disks can be added to a disk group depending on the RAID level chosen, you must also take this in to account.

    Finally, if possible I would recommend spreading the disk groups over the different storage pools. This helps with performance, as one controller owns and serves one pool, while the other controller owns and serves the other pool. In the event of a cable or controller failure, the surviving controller will take ownership (depending on the type of failure) and/or allow access to the other storage pool.

    I hope this helps… If you can provide more details I’ll do my best to answer any questions or provide advice.

    Cheers,
    Stephen

  3. Hi Stephen. I went back to consecutive disks in RAID5, RAID6 and RAID10 to see what speeds I could get, but with an 8Gb HBA, 2x HPE SAN 8/8 switches and the round robin path selection policy in ESXi 6, I’m only getting between 400MB/s read and write. Any tips on where I could look for throttling or sub-optimal connections? I’ve got one DL380 G9 with 1x dual port Emulex 8Gb card, each port connected to a host port on either SAN switch. Each SAN switch has 2x connections to each controller on the MSA2040 (controller A & B port 1 to SAN switch 1, controller A & B port 2 to SAN switch 2).

  4. Hi “MSA2040 D2700 owner”,

    When disk groups are created, there is an initialization time and after completion, a virtual disk scrub is initiated. Are you waiting until these tasks are completed before testing speeds?

    I’m not that familiar with FC (I’ve only worked with iSCSI), but I do believe you should be getting faster speeds. Have you read the best practice documents that HPE provides on configuration? They cover absolutely everything.

    If you’re using ESXi (vSphere), you should check to make sure that all “optimized” paths are being used, and that “non-optimized” paths are active but don’t have I/O going over them. You also need to check your configuration of LUNs with the best practice document to make sure everything is configured properly.
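
    If it helps, these are roughly the esxcli commands I’d use to check the pathing from the ESXi shell. The naa identifier below is just a placeholder for your actual LUN, and the round robin IOPS=1 change is only a common tweak worth testing, not something specific to your setup:

      # List each device with its path selection policy and working paths
      esxcli storage nmp device list

      # Show every path to a specific LUN - optimized paths should be "active",
      # ALUA non-optimized paths should show as "active unoptimized"
      esxcli storage core path list -d naa.600c0ff000xxxxxxxxxxxxxx

      # Common tweak worth testing: switch round robin to rotate I/O every 1 IOPS
      esxcli storage nmp psp roundrobin deviceconfig set -d naa.600c0ff000xxxxxxxxxxxxxx --type=iops --iops=1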

    I hope this helps!

    Cheers

  5. Hi Stephen, I have a small cluster of two ESXi 6.5 hosts connected to an MSA 1040 with dual 10Gb iSCSI via two HPE 1950 OfficeConnect switches in stack mode.

    I’m having performance issues. I have 8 x 15k 900GB SAS disks which I’ve tried in different combinations (RAID 1, and RAID 5 with 3, 5 and 8 disks), making only one disk group and a single volume mapped to just one host and a testing VM, but the write/read speed from within the VM never passes the 40/120 Mbps mark.

    The mapping seems to be correct. I set the round robin IOPS limit to 1 in ESXi and set the Missing LUN Response to Illegal Request on the MSA, but nothing increases the performance.

    I even opened a case with HPE, but they checked the logs and everything looks OK, so it must be a misconfiguration or something else; the only recommendation they gave was to add more spindles or use SSDs.

    It would be great if you could give me some baseline numbers for IOPS/throughput, or any advice to get better performance from this setup.

    Regards,
    Eder

  6. Hi Eder,

    Sorry to hear you’re having issues. You’ll need to troubleshoot this in stages.

    1) Confirm that the iSCSI links on the SAN are running at 10Gb speeds.
    2) Confirm that the switch is configured properly, and is switching at 10Gbps speeds.
    3) Confirm that the hosts are communicating with the switches at 10Gbps speeds.
    4) Even if the NIC is running at 10Gbps, you need to verify that the ESXi OS is communicating and functioning at 10Gbps.
    5) Make sure jumbo frames are configured on the SAN.
    6) Confirm that the special MTU is configured on the ESXi hosts (see the verification sketch after this list). The MSA 2040 jumbo frame size is 8900; I’m assuming it is the same for the 1040 (you’ll need to verify this).
    7) Confirm that your MPIO settings are correct and valid (all paths are valid and functioning, none are created in error).
    8) Confirm that your subnetting is correct and valid for MPIO and path redundancy.
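
    For the MTU/jumbo frame items above, here’s a rough sketch of how I’d verify things from the ESXi shell (vmk1 and the IP address are just example placeholders for your iSCSI vmkernel port and one of the MSA’s iSCSI ports):

      # Check the MTU configured on the vSwitch(es) and the vmkernel ports
      esxcli network vswitch standard list
      esxcli network ip interface list

      # Test that an 8900-byte frame actually passes end-to-end without fragmenting
      # (8872 = 8900 minus 28 bytes of IP/ICMP headers)
      vmkping -I vmk1 -d -s 8872 10.0.0.10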

    I benchmarked the controllers on my MSA2040 and received these results:
    https://www.stephenwagner.com/2014/06/07/hp-msa-2040-benchmark-read-write-and-iops/

    You have a 1040, so your numbers won’t be identical, but they should be in the ballpark. That benchmark post is old; I have since gotten the speed of the entire unit (controllers A and B combined) to around 1.7GB/sec (around 850MB/sec per controller) in real world scenarios.

    Your measurement of 40Mbps to 120Mbps converts to 5MB/sec to 15 MB/sec which is absolutely horrible. Could you confirm this is correct, or did you mean 40-120MB/sec and not Mb/sec?

    Cheers

  7. Hi Stephen,
    So you were able to delete all of your original linear disk groups without issue? I am hesitant because I thought the 1st disk group contains a management LUN that takes care of tracking the MSA 1040’s storage operations.
    Am I mistaken? Was there no special care besides making sure all the VMs were 1st migrated from the LUNs on the linear disk groups before deleting the linear disk groups?

  8. Hi hards226,

    The management LUN (LUN 0) is for management purposes and is not part of the data LUNs. The only consideration that should ever be made with this LUN is to make sure that when creating data LUNs you start your LUN numbering at 1 or higher (as mentioned in the best practice document).

    As far as I know, deleting LUNs has no effect on LUN 0 (the management LUN), so long as you don’t have them overlapping. If you created data LUNs starting with LUN 0 (overlapping the management LUN) as many have done (since they didn’t read the best practice document), I can’t comment on what will happen. It was fine in my case.

    When I did my conversion, I was one of those people who accidentally started their data LUNs at 0 instead of 1. When I went to migrate to virtual LUNs, I also used it as an opportunity to renumber my data LUNs and free up LUN 0. Keep in mind, I have an MSA 2040, so I’m not sure what is different with the 1040. After deleting, re-creating, and doing a +1 on the LUN numbering, I freed up my management LUN number. Please keep in mind that my unit is an iSCSI unit; the behavior on Fibre Channel or SAS could be completely different.

    As far as your question on special considerations, you’ll need to take into account anything on your host OS that may be mounted to that volume. In my case, I’m using VMware vSphere, so there were a few special considerations. Before I could successfully unmount and delete the volume in vSphere, I had to change the persistent log file location, as well as some things with the coredump/vmkdump location (if I remember correctly), before vSphere would let me gracefully unmount the datastore.
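
    For reference, this is roughly what moving those items off the datastore looks like from the ESXi shell; “OtherDatastore” is just an example name for wherever you want the logs and the coredump file to live:

      # Point the persistent log (syslog) location at another datastore
      esxcli system syslog config set --logdir=/vmfs/volumes/OtherDatastore/logs
      esxcli system syslog reload

      # Check for a coredump (vmkdump) file living on the old datastore,
      # then unconfigure/remove it and create a new one elsewhere
      esxcli system coredump file list
      esxcli system coredump file set --unconfigure
      esxcli system coredump file remove --force
      esxcli system coredump file add --datastore=OtherDatastore
      esxcli system coredump file set --smart --enable=true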

    Keep in mind that after moving the content (or VMs), you’ll need to gracefully unmount, and gracefully delete when you’re ready. DO NOT force unmount. If any errors come up, you’ll need to search for KBs to resolve them so you can get to the point where you can do this gracefully.

    I hope this helps!

    Cheers,
    Stephen

  9. Hi Stephen

    In my case I have two sets of RAID disks (2 x 12 x 900GB SAS) set up as virtual disk groups, and I don’t have a spare disk in either of the sets. Our plan is to change from virtual to linear and, at the same time, specify the spares. How hard is it to do?

  10. Hi Ricardo,

    So just to confirm I’m understanding you: You’re actually going to switch from the advanced virtual disk groups, to the basic linear storage disk groups. And while doing this migration, you’d like to provision spares on the new disk groups.

    This should be fairly simple as long as you migrate the data off the old disk groups, follow the proper procedure to decommission/delete the volume, and then re-create the new volumes and provision a spare. Again, keep in mind that during this procedure you are deleting arrays, so make sure you have a valid backup, and also migrate your data properly.

    A final note: if you are deleting your existing array with no spares and creating a new one with a spare, the new array will have less available disk space than the previous one. Make sure that the data you are migrating is not larger than the new array size.

    Hope this helps.

    Cheers,
    Stephen

  11. Hello,

    I have 12 x 1TB disks in an HP MSA 2040 SAN with two hosts. Can you advise on the best configuration in terms of the number of virtual disks and volumes?

    Dominic

  12. Hi Dominic,

    If you have the dual controller version of the SAN, I’d recommend dividing the virtual disks (volumes) across both controllers to take advantage of redundancy and performance. As for the individual virtual disks and RAID levels, this depends on the applications you’re running on the SAN, and I can’t comment since I don’t know what you’re using it for.

    Hope that helps!

    Cheers,
    Stephen

  13. Hi Stephen,

    I have a 2052 with 2 SSDs, and 22 1.8TB SAS drives. We want two RAID 10 volumes of 12 disks each, the first volume having the performance tier. When I set up the volumes it would not allow snapshots unless it also enabled over-commit. This is not the same as a 2042 that I set up last year. The issue we had is I created a volume, added a VMware datastore and the client filled it and killed the system. I wondered if you have any advice on setting this up so we can effectively thick-provision the volumes on the SAN but also allow a snapshot?

    Appreciate this may be out of scope, so apologies if you can’t offer advice on this one!

    Thanks

  14. Hi Mike,

    This behavior sounds about right. In order to be able to snapshot without over-committing, you’ll need to make your volume way smaller than the amount of free disk space on the actual RAID array. This means you’ll have tons of unutilized space, but you’ll be able to do snapshots.

    This holds true for linear volumes, but it may or may not hold true for virtual volumes.

    If I remember correctly (correct me if I’m wrong), when configuring virtual disk groups you allocate all the storage to the pool. If you don’t, then you should be able to use the unallocated space for snapshots without over-committing, but this needs confirmation…

    In a typical environment you would actually use the over-commit for snapshots, however I’d highly recommend custom configuring the percentage full monitoring values so that notifications are sent to the system admin (these are configurable via the CLI). Even with default values, your client should have received ample warnings that the space was filling before they maxed it out.
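
    Going from memory, setting those thresholds from the MSA CLI looks something like the sketch below. I can’t guarantee the exact parameter names on every firmware level, so treat this as an assumption and check “help set pool” on your unit first (pool “A” and the percentages are just examples):

      # Show current pool usage and thresholds
      show pools

      # Adjust the percentage-full thresholds that trigger event notifications
      # (verify the exact parameter names with "help set pool" on your firmware)
      set pool low-threshold 50% middle-threshold 75% A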

    In my own environment, I have scheduled snapshots and have configured custom thresholds, which work amazingly well!

    Hope this helps!

    Stephen

  15. Thanks for the quick reply Stephen. Yes, we allocated all space initially, but for some reason even with a thick-provisioned disk in VMware it filled the SAN. I dislike that it now forces overcommit without really allowing any choice!

    The issue with notification is that the client dumped 8TB of data overnight and filled it. But on a thick disk I’m struggling to understand why the SAN did actually fill up.

    Thanks again for your advice.

  16. Was the thick provisioned vmdk the only vmdk on the VMFS volume? If there were others I believe this may have been the culprit.

    Also, keep in mind that with other thin provisioned disks, even when deleting storage, it’s not actually released back to the SAN until a DELETE/UNMAP command is issued (which marks the space on the SAN as unused).

    Further to this, I believe that on the MSA SANs the command has to be manually run on the ESXi host to release the space back to the storage.
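
    On ESXi, the manual reclaim is a one-liner from the shell (the datastore name is just an example; it can take a while and generate a fair bit of I/O, so run it during a quiet window):

      # Manually issue UNMAP for the free space on a datastore
      esxcli storage vmfs unmap -l DatastoreName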

    In your environment I’d recommend looking at the SAN storage utilization, compare it to your VMDK utilization, and see if something is using up storage without you knowing about it.

    Technically, when deploying the SAN and allocating the storage (especially VMDKs), even using over-commitment nothing should have gone wrong, as technically your client shouldn’t have been able to exceed the space you set for them.

  17. Yes, it was the only vmdk. It’s most odd!

    Thanks for the advice 🙂

  18. And just to confirm, if it was the only VMDK, and it was thick provisioned, you didn’t allocate the entire VMFS datastore to it, correct? You made sure to leave room available?

    I’m wondering if there was overhead, or a pre-existing UNMAP job that failed. At one point in time, I had an issue where a manual UNMAP attempt failed and royally screwed up my VMFS volume and the available storage. I ended up having to Storage vMotion the VMs off the VMFS volume, delete and re-create it, and then move them back.

  19. It’s really odd. The volume is 10TB, and the datastore is 8TB. The SAN was added a couple of weeks ago to a single host using an HBA purely to create a new large data drive on their file server.

    I’ve factory reset it today and set up all over again, so once it’s finished initialising we’ll see how it goes!

  20. Keep me posted on this!

    By the way, if you’re using virtual volumes on the SAN, why are you thick provisioning the vmdk? You’re losing quite a few features by doing this, and the performance loss from switching to thin provisioned disks would be negligible.

    I could be wrong, but with virtual disk groups the performance difference between thick and thin provisioning is all but erased.

  21. No option but to use virtual on the SAN now, they’ve removed the linear option. The drive created is being hammered with mapping data and they are dumping TBs of data on and taking it off again almost daily, so we wanted a fixed limit they can work to. They have roughly 7TB of mapping data that they have per project, it gets dumped from USB onto this drive, manipulated and then moved onto an archive system, so the data usage is always going to be ~7TB.

  22. Hi Stephen. I’m running into some issues with the MSA 2040. I’m just starting to set up a couple of DL280 Gen10s and the SAN for a small ESXi environment. I only want one volume on the SAN that can be used by both hosts. The trouble is that when I format the volume from one of the hosts to VMFS, it never shows up on the other host as a usable volume. They’re all running FC through a Brocade. It doesn’t matter which one I format to VMFS with; there it stays, and the other can’t access it after that.

    I set up 5 Dell MD3800Fs exactly the same way, and all of the hosts can access all of the volumes available on each of them, but the Dell actually had a setting for “VMware”.

    I did a little research on this, and was looking into explicit vs. default connections, but don’t see that in the web menu.

    Any thoughts?

  23. Hi Greg,

    I’m not too familiar with Fibre Channel, but I’ll do my best to help out. I just want to go over a couple of things (I’m used to iSCSI, so I hope none of these are specific to iSCSI):

    1) When you created the volume on the SAN itself and then created the host mappings, did you start at LUN 1? The controllers themselves use LUN 0, but for some reason the default numbering starts at 0, when you should manually choose 1 or higher.

    2) Did you create a host group for the mapping? I’d recommend doing this to make sure the settings are the same for both hosts.

    3) After you formatted the volume as VMFS, did you select “rescan host storage” on the other host, or do a scan of all the HBAs on that host? It’s recommended to do this, as it should detect the VMFS volume and then mount it (see the commands below).
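
    From the ESXi shell on the second host, the rescan/mount steps look roughly like this (the datastore label is just an example):

      # Rescan all HBAs for new devices, then list known VMFS filesystems
      esxcli storage core adapter rescan --all
      esxcli storage filesystem list

      # If the volume shows up but isn't mounted, mount it by its label
      esxcli storage filesystem mount -l DatastoreName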

    Cheers,
    Stephen

  24. Hi Steven

    I am brand new to working with SANs. One of our new clients has an HP MSA 1040 SAN that had a disk that needed to be replaced. Here is the scenario:

    1. The drive has been replaced with one of the same specs.
    2. All slots are full.
    3. The drive that we replaced has a solid green LED status, and looks OK within the management interface.
    4. There are 3 pools with 4 disks in each (linear), and 1 volume.
    5. The 2nd disk group is degraded – error message:
    – Reason: “The disk group is not fault tolerant. Reconstruction cannot start because there is no spare disk available of the proper type and size.”
    – Recommendation: Replace the disk with one of the same type (SAS SSD, enterprise SAS, or midline SAS) and the same or greater capacity. For continued optimum I/O performance, the replacement disk should have performance that is the same as or better than the one it is replacing.
    – Configure the new disk as a spare so the system can start reconstructing the vdisk.
    – To prevent this problem in the future, configure one or more additional disks as spare disks.
    6. Under the System tab>Front Tab – it shows the replacement drive as the color purple while the others are grey
    7. The disk itself shows as Archive Tier, while the others are linear.

    I know the overall goal now has to do with adding this new disk to disk group 2. I don’t see any way of doing that or expanding the group – if I click on anything, I am unable to select anything. I’m not sure if I have to create a new volume or what… According to the error message, it sounds like there should be a spare disk that the degraded disk group is supposed to rebuild onto? The issue is there are no slots left and no spares. I figured replacing the disk would maybe do it, but the group is supposedly not fault tolerant. I’m just very confused on this. Not sure if you can point me in the correct direction or not.

    Thanks.

  25. Hey Scott,

    No worries, I can get you sorted out quickly! 🙂

    There’s a “gotcha” in the MSA documentation, where when a disk fails, simply replacing it doesn’t restart the rebuild process. When you add the new disk, you must add it as a spare, which it will then detect and start rebuilding to restore redundancy.

    Check out my post here which goes in to detail: https://www.stephenwagner.com/2017/01/27/hpe-msa-2040-disk-failure-considerations-and-steps/

    In your case you didn’t already have a spare, so you would remove the faulty disk, insert the new one, assign it as a global spare, and then the rebuild would start.

    Let me know if that gets you going!

    Cheers,
    Stephen

  26. Hi Stephen.
    Is it possible to add an SSD for Read Cache at some point in the life of the SAN?
    Let me explain: I have a 2050 and a pool (virtual, since that’s the only type you can create now), and I would like to add this functionality (no licenses needed).
    Obviously I don’t want to destroy the pool. At the moment I don’t see this option in the actions menu, but maybe that’s because the SSD disk isn’t installed yet.

  27. Hi Mauro,

    The MSA supports SSD read cache without any additional licenses.

    I think the array supports 2 SSD disks for use as read cache. Once you add the SSDs, you should be able to add them as a read cache disk group on the virtual pool to enable the cache.

    Stephen
