May 012021
 
Picture of NVMe Storage Server Project

For over a year and a half I have been working on building a custom NVMe Storage Server for my homelab. I wanted to build a high speed storage system similar to a NAS or SAN, backed with NVMe drives that provides iSCSI, NFS, and SMB Windows File Shares to my network.

The computers accessing the NVMe Storage Server would include VMware ESXi hosts, Raspberry Pi SBCs, and of course Windows Computers and Workstations.

The focus of this project is on high throughput (in the GB/sec) and IOPS.

The current plan for the storage environment is for video editing, as well as VDI VM storage. This can and will change as the project progresses.

The History

More and more businesses are using all-flash NVMe and SSD based storage systems, so I figured there’s no reason why I can’t have build and have my own budget custom all NVMe flash NAS.

This is the story of how I built my own NVMe based Storage Server.

The first version of the NVMe Storage Server consisted of the IO-PEX40152 card with 4 x 2TB Sabrent Rocket 4 NVMe drives inside of an HPE Proliant DL360p Gen8 Server. The server was running ESXi with TrueNAS virtualized, and the PCIe card passed through to the TrueNAS VM.

The results were great, the performance was amazing, and both servers had access to the NFS export via 2 x 10Gb SFP+ networking.

There were three main problems with this setup:

  1. Virtualized – Once a month I had an ESXi PSOD. This was either due to overheating of the IO-PEX40152 card because of modifications I made, or bugs with the DL360p servers and PCIe passthrough.
  2. NFS instead of iSCSI – Because TrueNAS was virtualized inside of the host that was using it for storage, I had to use NFS since the host virtualizing TrueNAS would also be accessing the data on the TrueNAS VM. When shutting down the host, you need to shut down TrueNAS first. NFS disconnects are handled way healthier than iSCSI disconnects (which can cause corruption even if no files are being used).
  3. CPU Cores maxed on data transfer – When doing initial testing, I was maxing out the CPU cores assigned to the TrueNAS VM because the data transfers were so high. I needed a CPU and setup that was better fit.

Version 1 went great, but you can see some things needed to be changed. I decided to go with a dedicated server, not virtualize TrueNAS, and go for a newer CPU with a higher Ghz speed.

And so, version 2 was born (built). Keep reading and scrolling for pictures!

The Hardware

On version 2 of the project, the hardware includes:

Notes on the Hardware:

  • While the ML310e Gen8 v2 server is a cheap low entry server, it’s been a fantastic team member of my homelab.
  • HPE Dual 10G Port 560SFP+ adapters can be found brand new in unsealed boxes on eBay at very attractive prices. Using HPE Parts inside of HPE Servers, avoids the fans from spinning up fast.
  • The ML310e Gen8 v2 has some issues with passing through PCIe cards to ESXi. Works perfect when not passing through.

The new NVMe Storage Server

I decided to repurpose an HPE Proliant ML310e Gen8 v2 Server. This server was originally acting as my Nvidia Grid K1 VDI server, because it supported large PCIe cards. With the addition of my new AMD S7150 x2 hacked in/on to one of my DL360p Gen8’s, I no longer needed the GRID card in this server and decided to repurpose it.

Picture of an HPe ML310e Gen8 v2 with NVMe Storage
HPe ML310e Gen8 v2 with NVMe Storage

I installed the IOCREST IO-PEX40152 card in to the PCIe 16x slot, with 4 x 2TB Sabrent Rocket 4 NVME drives.

Picture of IOCREST IO-PEX40152 with GLOTRENDS M.2 NVMe SSD Heatsink on Sabrent Rocket 4 NVME
IOCREST IO-PEX40152 with GLOTRENDS M.2 NVMe SSD Heatsink on Sabrent Rocket 4 NVME

While the server has a PCIe 16x wide slot, it only has an 8x bus going to the slot. This means we will have half the capable speed vs the true 16x slot. This however does not pose a problem because we’ll be maxing out the 10Gb NICs long before we max out the 8x bus speed.

I also installed an HPE Dual Port 560SFP+ NIC in to the second slot. This will allow a total of 2 x 10Gb network connections from the server to the Ubiquiti UniFi US-16-XG 10Gb network switch, the backbone of my network.

The Server also have 4 x Hot Swappable HD bays on the front. When configured in HBA mode (via the BIOS), these are accessible by TrueNAS and can be used. I plan on populating these with 4 x 4TB HPE MDL SATA Hot Swappable drives to act as a replication destination for the NVMe pool and/or slower magnetic long-term storage.

Front view of HPE ML310e Gen8 v2 with Hotswap Drive bays
HPE ML310e Gen8 v2 with Hotswap Drive bays

I may also try to give WD RED Pro drives a try, but I’m not sure if they will cause the fans to speed up on the server.

TrueNAS Installation and Configuration

For the initial Proof-Of-Concept for version 2, I decided to be quick and dirty and install it to a USB stick. I also waited until I installed TrueNAS on to the USB stick and completed basic configuration before installing the Quad NVMe PCIe card and 10Gb NIC. I’m using a USB 3.0 port on the back of the server for speed, as I can’t verify if the port on the motherboard is USB 2 or USB 3.

Picture of a TrueNAS USB Stick on HPE ML310e Gen8 v2
TrueNAS USB Stick on HPE ML310e Gen8 v2

TrueNAS installation worked without any problems whatsoever on the ML310e. I configured the basic IP, time, accounts, and other generic settings. I then proceeded to install the PCIe cards (storage and networking).

Screenshot of TrueNAS Dashboard Installed on NVMe Storage Server
TrueNAS Installed on NVMe Storage Server

All NVMe drives were recognized, along with the 2 HDDs I had in the front Hot-swap bays (sitting on an HP B120i Controller configured in HBA mode).

Screenshot of available TrueNAS NVMe Disks
TrueNAS NVMe Disks

The 560SFP+ NIC also was detected without any issues and available to configure.

Dashboard Screenshot of TrueNAS 560SFP+ 10Gb NIC
TrueNAS 560SFP+ 10Gb NIC

Storage Configuration

I’ve already done some testing and created a guide on FreeNAS and TrueNAS ZFS Optimizations and Considerations for SSD and NVMe, so I made sure to use what I learned in this version of the project.

I created a striped pool (no redundancy) of all 4 x 2TB NVMe drives. This gave us around 8TB of usable high speed NVMe storage. I also created some datasets and a zVOL for iSCSI.

Screenshot of NVMe TrueNAS Storage Pool with Datasets and zVol
NVMe TrueNAS Storage Pool with Datasets and zVol

I chose to go with the defaults for compression to start with. I will be testing throughput and achievable speeds in the future. You should always test this in every and all custom environments as the results will always vary.

Network Configuration

Initial configuration was done via the 1Gb NIC connection to my main LAN network. I had to change this as the 10Gb NIC will be directly connected to the network backbone and needs to access the LAN and Storage VLANs.

I went ahead and configured a VLAN Interface on VLAN 220 for the Storage network. Connections for iSCSI and NFS will be made on this network as all my ESXi servers have vmknics configured on this VLAN for storage. I also made sure to configure an MTU of 9000 for jumbo frames (packets) to increase performance. Remember that all hosts must have the same MTU to communicate.

Screenshot of 10Gb NIC on Storage VLAN
10Gb NIC on Storage VLAN

Next up, I had to create another VLAN interface for the LAN network. This would be used for management, as well as to provide Windows File Share (SMB/Samba) access to the workstations on the network. We leave the MTU on this adapter as 1500 since that’s what my LAN network is using.

Screenshot of 10Gb NIC on LAN VLAN
10Gb NIC on LAN VLAN

As a note, I had to delete the configuration for the existing management settings (don’t worry, it doesn’t take effect until you hit test) and configure the VLAN interface for my LANs VLAN and IP. I tested the settings, confirmed it was good, and it was all setup.

At this point, only the 10Gb NIC is now being used so I went ahead and disconnected the 1Gb network cable.

Sharing Setup and Configuration

It’s now time to configure the sharing protocols that will be used. As mentioned before, I plan on deploying iSCSI, NFS, and Windows File Shares (SMB/Samba).

iSCSI and NFS Configuration

Normally, for a VMware ESXi virtualization environment, I would always usually prefer iSCSI based storage, however I also wanted to configure NFS to test throughput of both with NVMe flash storage.

Earlier, I created the datasets for all my my NFS exports and a zVOL volume for iSCSI.

Note, that in order to take advantage of the VMware VAAI storage directives (enhancements), you must use a zVOL to present an iSCSI target to an ESXi host.

For NFS, you can simply create a dataset and then export it.

For iSCSI, you need to create a zVol and then configure the iSCSI Target settings and make it available.

SMB (Windows File Shares)

I needed to create a Windows File Share for file based storage from Windows computers. I plan on using the Windows File Share for high-speed storage of files for video editing.

Using the dataset I created earlier, I configured a Windows Share, user accounts, and tested accessing it. Works perfect!

Connecting the host

Connecting the ESXi hosts to the iSCSI targets and the NFS exports is done in the exact same way that you would with any other storage system, so I won’t be including details on that in this post.

We can clearly see the iSCSI target and NFS exports on the ESXi host.

Screenshot of TrueNAS NVMe iSCSI Target on VMware ESXi Host
TrueNAS NVMe iSCSI Target on VMware ESXi Host
Screenshot of NVMe iSCSI and NFS ESXi Datastores
NVMe iSCSI and NFS ESXi Datastores

To access Windows File Shares, we log on and map the network share like you would normally with any file server.

Testing

For testing, I moved (using Storage vMotion) my main VDI desktop to the new NVMe based iSCSI Target LUN on the NVMe Storage Server. After testing iSCSI, I then used Storage vMotion again to move it to the NFS datastore. Please see below for the NVMe storage server speed test results.

Speed Tests

Just to start off, I want to post a screenshot of a few previous benchmarks I compiled when testing and reviewing the Sabrent Rocket 4 NVMe SSD disks installed in my HPE DL360p Gen8 Server and passed through to a VM (Add NVMe capability to an HPE Proliant DL360p Gen8 Server).

Screenshot of CrystalDiskMark testing an IOCREST IO-PEX40152 and Sabrent Rocket 4 NVME SSD for speed
CrystalDiskMark testing an IOCREST IO-PEX40152 and Sabrent Rocket 4 NVME SSD
Screenshot of CrystalDiskMark testing IOPS on an IOCREST IO-PEX40152 and Sabrent Rocket 4 NVME SSD
CrystalDiskMark testing IOPS on an IOCREST IO-PEX40152 and Sabrent Rocket 4 NVME SSD

Note, that when I performed these tests, my CPU was maxed out and limiting the actual throughput. Even then, these are some fairly impressive speeds. Also, these tests were directly testing each NVMe drive individually.

Moving on to the NVMe Storage Server, I decided to test iSCSI NVMe throughput and NFS NVMe throughput.

I opened up CrystalDiskMark and started a generic test, running a 16GB test file a total of 6 times on my VDI VM sitting on the iSCSI NVMe LUN.

Screenshot of NVMe Storage Server iSCSI Benchmark with CrystalDiskMark
NVMe Storage Server iSCSI Benchmark with CrystalDiskMark

You can see some impressive speeds maxing out the 10Gb NIC with crazy performance of the NVME storage:

  • 1196MB/sec READ
  • 1145.28MB/sec WRITE (Maxing out the 10GB NIC)
  • 62,725.10 IOPS READ
  • 42,203.13 IOPS WRITE

Additionally, here’s a screenshot of the ix0 NIC on the TrueNAS system during the speed test benchmark: 1.12 GiB/s.

Screenshot of TrueNAS NVME Maxing out 10Gig NIC
TrueNAS NVME Maxing out 10Gig NIC

And remember this is with compression. I’m really excited to see how I can further tweak and optimize this, and also what increases will come with configuring iSCSI MPIO. I’m also going to try to increase the IOPS to get them closer to what each individual NVMe drive can do.

Now on to NFS, the results were horrible when moving the VM to the NFS Export.

Screenshot of NVMe Storage Server NFS Benchmark with CrystalDiskMark
NVMe Storage Server NFS Benchmark with CrystalDiskMark

You can see that the read speed was impressive, but the write speed was not. This is partly due to how writes are handled with NFS exports.

Clearly iSCSI is the best performing method for ESXi host connectivity to a TrueNAS based NVMe Storage Server. This works perfect because we’ll get the VAAI features (like being able to reclaim space).

Moving Forward

I’ve had this configuration running for around a week now with absolutely no issues, no crashes, and it’s been very stable.

Using a VDI VM on NVMe backed storage is lightning fast and I love the experience.

I plan on running like this for a little while to continue to test the stability of the environment before making more changes and expanding the configuration and usage.

Future Plans (and Configuration)

  • Drive Bays
    • I plan to populate the 4 hot-swappable drive bays with HPE 4TB MDL drives. Configured with RaidZ1, this should give me around 12TB usable storage. I can use this for file storage, backups, replication, and more.
  • NVMe Replication
    • This design was focused on creating non-redundant extremely fast storage. Because I’m limited to a total of 4 NVMe disks in this design, I chose not to use RaidZ and striped the data. If one NVMe drive is lost, all data is lost.
    • I don’t plan on storing anything important, and at this point the storage is only being used for VDI VMs (which are backed up), and Video editing.
    • If I can populate the front drive bays, I can replicate the NVMe storage to the traditional HDD storage on a frequent basis to protect against failure to some level or degree.
  • Version 3 of the NVMe Storage Server
    • More NVMe and Bigger NVMe – I want more storage! I want to test different levels of RaidZ, and connect to the backbone at even faster speeds.
    • NVME Drives with PLP (Power Loss Prevention) for data security and protection.
    • Dual Power Supply

Let me know your thoughts and ideas on this setup!

Oct 182020
 
Screenshot of The Tech Informative Side Chat - HPE Integrated Lights-Out

A new Side Chat Episode of the Tech Informative is now live on YouTube. In this episode we are covering HPE Integrated Lights-Out, also known as HPE iLO.

The Tech Informative Side Chat on HPE Integrated Lights-Out (HPE iLO)

The Tech Informative is a video podcast by Stephen Wagner and Rob Dalton that hopes to explore everyday technologies from the perspective of Information Technology professionals.

Rob Dalton is a lover of IT and a Director by profession. Rob considers himself a jack of all trades, an IT veteran, and is also the author of “Secured Packets”, a technology blog with a focus on security. Rob’s blog can be found at: https://www.securedpackets.com

Stephen Wagner is the President of Digitally Accurate Inc., an IT Solutions and Managed Services company. Stephen is also the author of “The Tech Journal”, an online Technology Blog. Stephen’s blog can be found at: https://www.stephenwagner.com

In this Side Chat, we cover the following:

What is HPE iLO

  • iLO and OneView
  • iLO Amplifier Pack
  • iLO, InfoSight, and HPE Servers

What does iLO do

  • Remote Access
  • Firmware Management
  • Server provisioning (Install OS, Recover, DR, etc.)
  • Features we find most valuable

How is HPE Integrated Lights-Out licensed

Competing Products

  • iDRAC
  • IPMI

We also cover what’s on the next episode of The Tech Informative

Don’t forget to like and subscribe! Leave a comment, feedback, or suggestions in the video comments section.

Jul 072020
 
Picture of a business office with cubicles

In the ever-evolving world of IT and End User Computing (EUC), new technologies and solutions are constantly being developed to decrease costs, improve functionality, and help the business’ bottom line. In this pursuit, as far as end user computing goes, two technologies have emerged: Hosted Desktop Infrastructure (HDI), and Virtual Desktop Infrastructure (VDI). In this post I hope to explain the differences and compare the technologies.

We’re at a point where due to the low cost of backend server computing, performance, and storage, it doesn’t make sense to waste end user hardware and resources. By deploying thin clients, zero clients, or software clients, we can reduce the cost per user for workstations or desktop computers, and consolidate these on the backend side of things. By moving moving EUC to the data center (or server room), we can reduce power requirements, reduce hardware and licensing costs, and take advantage of some cool technologies thanks to the use of virtualization and/or Storage (SANs), snapshots, fancy provisioning, backup and disaster recovery, and others.

See below for the video, or read on for the blog post!

And it doesn’t stop there, utilizing these technologies minimizes the resources required and spent on managing, monitoring, and supporting end user computing. For businesses this is a significant reduction in costs, as well as downtime.

What is Hosted Desktop Infrastructure (HDI) and Virtual Desktop Infrastructure (VDI)

Many IT professionals still don’t fully understand the difference between HDI and VDI, but it’s as sample as this: Hosted Desktop Infrastructure runs natively on the bare metal (whether it’s a server, or SoC) and is controlled and provided by a provisioning server or connection broker, whereas Virtual Desktop Infrastructure virtualizes (like you’re accustomed to with servers) the desktops in a virtual environment and is controlled and provided via hypervisors running on the physical hardware.

Hosted Desktop Infrastructure (HDI)

As mentioned above, Hosted Desktop Infrastructure hosts the End User Computing sessions on bare metal hardware in your datacenter (on servers). A connection broker handles the connections from the thin clients, zero clients, or software clients to the bare metal allowing the end user to see the video display, and interact with the workstation instance via keyboard and mouse.

Pros:

  • Remote Access capabilities
  • Reduction in EUC hardware and cost-savings
  • Simplifies IT Management and Support
  • Reduces downtime
  • Added redundancy
  • Runs on bare metal hardware
  • Resources are dedicated and not shared, the user has full access to the hardware the instance runs on (CPU, Memory, GPU, etc)
  • Easily provide accelerated graphics to EUC instances without additional costs
  • Reduction in licensing as virtualization products don’t need to be used

Cons:

  • Limited instance count to possible instances on hardware
  • Scaling out requires immediate purchase of hardware
  • Some virtualization features are not available since this solution doesn’t use virtualization
  • Additional backup strategy may need to be implemented separate from your virtualized infrastructure

Example:

If you require dedicated resources for end users and want to be as cost-effective as possible, HDI is a great candidate.

An example HDI deployment would utilize HPE Moonshot which is one of the main uses for HPE Moonshot 1500 chassis. HPE Moonshot allows you to provision up to 180 OS instances for each HPE Moonshot 1500 chassis.

More information on the HPE Moonshot (and HPE Edgeline EL4000 Converged Edge System) can be found here: https://www.stephenwagner.com/2018/08/22/hpe-moonshot-the-absolute-definition-of-high-density-software-defined-infrastructure/

Virtual Desktop Infrastructure (VDI)

Virtual Desktop Infrastructure virtualizes the end user operating system instances exactly how you virtualize your server infrastructure. In VMware environments, VMware Horizon View can provision, manage, and maintain the end user computing environments (virtual machines) to dynamically assign, distribute, manage, and broker sessions for users. The software product handles the connections and interaction between the virtualized workstation instances and the thin client, zero client, or software client.

Pros:

  • Remote Access capabilities
  • Reduction in EUC hardware and cost-savings
  • Simplifies IT Management and Support
  • Reduces downtime
  • Added redundancy
  • Runs as a virtual machine
  • Shared resources (you don’t waste hardware or resources as end users share the resources)
  • Easy to scale out (add more backend infrastructure as required, don’t need to “halt” scaling while waiting for equipment)
  • Can over-commit (over-provision)
  • Backup strategy is consistent with your virtualized infrastructure
  • Capabilities such as VMware DRS, VMware HA

Cons:

  • Resources are not dedicated and are shared, users share the server resources (CPU, Memory, GPU, etc)
  • Extra licensing may be required
  • Extra licensing required for virtual accelerated graphics (GPU)

Example:

If you want to share a pool of resources, require high availability, and/or have dynamic requirements then virtualization would be the way to go. You can over commit resources while expanding and growing your environment without any discontinuation of services. With virtualization you also have access to technologies such as DRS, HA, and special Backup and DR capabilities.

An example use case of VMware Horizon View and VDI can be found at: https://www.digitallyaccurate.com/blog/2018/01/23/vdi-use-case-scenario-machine-shops/

 Conclusion

Both technologies are great and have their own use cases depending on your business requirements. Make sure you research and weigh each of the options if you’re considering either technologies. Both are amazing technologies which will compliment and enhance your IT strategy.

Jun 072020
 

This month on June 23rd, HPE is hosting their annual HPE Discover event. This year is a little bit different as COVID-19 has resulted in a change of the usual in-person event, and this year’s event is now being hosted as a virtual experience.

I expect it’ll be the same great content as they have every year, only difference is you’ll be able to virtually experience it from the comfort of your own home.

I’m especially excited to say that I’ve been invited to be special VIP Influencer for the event, so I’ll be posting some content on Twitter, LinkedIn, and of course generating some posts on my blog.

HPE Discover Register Now Graphic
Register for HPE Discover Now

Stay tuned, and don’t forget to register at: https://www.hpe.com/us/en/discover.html

The content catalog is now live!

May 262020
 

So you want to add NVMe storage capability to your HPE Proliant DL360p Gen8 (or other Proliant Gen8 server) and don’t know where to start? Well, I was in the same situation until recently. However, after much research, a little bit of spending, I now have 8TB of NVMe storage in my HPE DL360p Gen8 Server thanks to the IOCREST IO-PEX40152.

Unsupported you say? Well, there are some of us who like to live life dangerously, there is also those of us with really cool homelabs. I like to think I’m the latter.

PLEASE NOTE: This is not a supported configuration. You’re doing this at your own risk. Also, note that consumer/prosumer NVME SSDs do not have PLP (Power Loss Prevention) technology. You should always use supported configurations and enterprise grade NVME SSDs in production environments.

Update – May 2nd 2021: Make sure you check out my other post where I install the IOCREST IO-PEX40152 in an HPE ML310e Gen8 v2 server for Version 2 of my NVMe Storage Server.

DISCLAIMER: If you attempt what I did in this post, you are doing it at your own risk. I won’t be held liable for any damages or issues.

NVMe Storage Server – Use Cases

There’s a number of reasons why you’d want to do this. Some of them include:

  • Server Storage
  • VMware Storage
  • VMware vSAN
  • Virtualized Storage (SDS as example)
  • VDI
  • Flash Cache
  • Special applications (database, high IO)

Adding NVMe capability

Well, after all that research I mentioned at the beginning of the post, I installed an IOCREST IO-PEX40152 inside of an HPE Proliant DL360p Gen8 to add NVMe capabilities to the server.

IOCREST IO-PEX40152 with 4 x 2TB Sabrent Rocket 4 NVME

At first I was concerned about dimensions as technically the card did fit, but technically it didn’t. I bought it anyways, along with 4 X 2TB Sabrent Rocket 4 NVMe SSDs.

The end result?

Picture of an HPE DL360p Gen8 with NVME SSD
HPE DL360p Gen8 with NVME SSD

IMPORTANT: Due to the airflow of the server, I highly recommend disconnecting and removing the fan built in to the IO-PEX40152. The DL360p server will create more than enough airflow and could cause the fan to spin up, generate electricity, and damage the card and NVME SSD.

Also, do not attempt to install the case cover, additional modification is required (see below).

The Fit

Installing the card inside of the PCIe riser was easy, but snug. The metal heatsink actually comes in to contact with the metal on the PCIe riser.

Picture of an IO-PEX40152 installed on DL360p PCIe Riser
IO-PEX40152 installed on DL360p PCIe Riser

You’ll notice how the card just barely fits inside of the 1U server. Some effort needs to be put in to get it installed properly.

Picture of an DL360p Gen8 1U Rack Server with IO-PEX40152 Installed
HPE DL360p Gen8 with IO-PEX40152 Installed

There are ribbon cables (and plastic fittings) directly where the end of the card goes, so you need to gently push these down and push cables to the side where there’s a small amount of thin room available.

We can’t put the case back on… Yet!

Unfortunately, just when I thought I was in the clear, I realized the case of the server cannot be installed. The metal bracket and locking mechanism on the case cover needs the space where a portion of the heatsink goes. Attempting to install this will cause it to hit the card.

Picture of the HPE DL360p Gen8 Case Locking Mechanism
HPE DL360p Gen8 Case Locking Mechanism

The above photo shows the locking mechanism protruding out of the case cover. This will hit the card (with the IOCREST IO-PEX40152 heatsink installed). If the heatsink is removed, the case might gently touch the card in it’s unlocked and recessed position, but from my measurements clears the card when locked fully and fully closed.

I had to come up with a temporary fix while I figure out what to do. Flip the lid and weight it down.

Picture of an HPE DL360p Gen8 case cover upside down
HPE DL360p Gen8 case cover upside down

For stability and other tests, I simply put the case cover on upside down and weighed it down with weights. Cooling is working great and even under high load I haven’t seen the SSD’s go above 38 Celsius.

The plan moving forward was to remove the IO-PEX40152 heatsink, and install individual heatsinks on the NVME SSD as well as the PEX PCIe switch chip. This should clear up enough room for the case cover to be installed properly.

The fix

I went on to Amazon and purchased the following items:

4 x GLOTRENDS M.2 NVMe SSD Heatsink for 2280 M.2 SSD

1 x BNTECHGO 4 Pcs 40mm x 40mm x 11mm Black Aluminum Heat Sink Cooling Fin

They arrived within days with Amazon Prime. I started to install them.

Picture of Installing GLOTRENDS M.2 NVMe SSD Heatsink on Sabrent Rocket 4 NVME
Installing GLOTRENDS M.2 NVMe SSD Heatsink on Sabrent Rocket 4 NVME
Picture of IOCREST IO-PEX40152 with GLOTRENDS M.2 NVMe SSD Heatsink on Sabrent Rocket 4 NVME
IOCREST IO-PEX40152 with GLOTRENDS M.2 NVMe SSD Heatsink on Sabrent Rocket 4 NVME

And now we install it in the DL360p Gen8 PCIe riser and install it in to the server.

You’ll notice it’s a nice fit! I had to compress some of the heat conductive goo on the PFX chip heatsink as the heatsink was slightly too high by 1/16th of an inch. After doing this it fit nicely.

Also, note the one of the cable/ribbon connectors by the SAS connections. I re-routed on of the cables between the SAS connectors they could be folded and lay under the card instead of pushing straight up in to the end of the card.

As I mentioned above, the locking mechanism on the case cover may come in to contact with the bottom of the IOCREST card when it’s in the unlocked and recessed position. With this setup, do not unlock the case or open the case when the server is running/plugged in as it may short the board. I have confirmed when it’s closed and locked, it clears the card. To avoid “accidents” I may come up with a non-conductive cover for the chips it hits (to the left of the fan connector on the card in the image).

And with that, we’ve closed the case on this project…

Picture of a HPE DL360p Gen8 Case Closed
HPE DL360p Gen8 Case Closed

One interesting thing to note is that the NVME SSD are running around 4-6 Celsius cooler post-modification with custom heatsinks than with the stock heatsink. I believe this is due to the awesome airflow achieved in the Proliant DL360 servers.

Conclusion

I’ve been running this configuration for 6 days now stress-testing and it’s been working great. With the server running VMware ESXi 6.5 U3, I am able to passthrough the individual NVME SSD to virtual machines. Best of all, installing this card did not cause the fans to spin up which is often the case when using non-HPE PCIe cards.

This is the perfect mod to add NVME storage to your server, or even try out technology like VMware vSAN. I have a number of cool projects coming up using this that I’m excited to share.

Apr 292020
 
Screenshot of HPE MSA Storage Array Health Check

Are you having issues your HPE MSA SAN? Want to have more insight in to your storage array? Last week, HPE made available a new tool that allows you to check the health of your HPE MSA Storage Array!

While this tool was released to the public last week, rumor has it that this is the same tool that HPE uses internally when providing support to customers.

This tool is FREE to use!

I originally spotted this on the MSA Storage section of the HPE Community forums here: https://community.hpe.com/t5/msa-storage/new-hpe-tool-msa-health-check/td-p/7085594

HPE MSA Array Health Check Video

See below for a video discussing and demonstrating the HPE MSA health Check on an HPE MSA 2040 SAN array.

Accessing the MSA Health Check

The HPE MSA Health Check site can be found at https://msa.ext.hpe.com/MSALogUploader.aspx

The following HPE MSA Arrays are supported:

  • HPE P2000 G3 MSA Array
  • HPE MSA 1040/1050
  • HPE MSA 2040 and variants (MSA 2042)
  • HPE MSA 2050 and variants (MSA 2052)

How to use the MSA Health Check

Using the HPE MSA Health Check is easy!

  1. Log on to your MSA Array SMU (Storage Management Utility)
  2. On the bottom left of the UI, click on the following up-arrow and select save logs
    Save Logs on HPE MSA Array Screenshot
  3. Wait for the logs to generate.
  4. Download the logs to your computer
  5. Open the MSA Storage Array Health Check
    Screenshot of HPE MSA Storage Array Health Check
  6. Click on the “Upload MSA Log File (.zip)” button, and then select your log dump zip file
  7. Wait for the File to upload
    Screenshot of Upload status on HPE MSA Array Log File
  8. View your health report, and optionally download a PDF copy
    Screenshot of a HPE MSA Array Health Check Report

And that’s it!

Available Tests

When running a health check, the following tests and checks are made on the log files:

  • Background Scrub Setting
  • Compact Flash Events
  • Controller Firmware Version Mismatch
  • Controller Partner Firmware Update Setting
  • Default User Check
  • Drive Firmware Version Mismatch
  • Enclosure Firmware Version Mismatch
  • NonSecure Protocols
  • Notification Settings
  • Sparing Best Practices
  • Unhealthy Component Check
  • Volume Mapping

Conclusion

Even if your MSA array is healthy, I’d still recommend generating a log dump and loading it up in to the MSA Health Check. Any extra visibility, is good visibility!

Apr 022020
 
HPE iLO Logo

In response to the COVID19 crises and to help customers and partners, HPE is providing a free iLO Advanced license for your HPE Servers.

These licenses are full licenses that are valid until January 1st, 2021. This means you’ll have the full Integrated Lights-Out product through the end of the 2020 year.

This was announced on the HPE Servers Facebook page, followed by a post on HPE’s blog!

You can watch my video below, or read on for more information!

How to get your free iLO Advanced license

To get your free license, head over to the link below.

https://www.hpe.com/us/en/resources/integrated-systems/ilo-advanced-trial.html

Free iLO Advanced License
Free iLO Advanced License

This link will take you to sign up for a HPE iLO Advanced trial license. After filling out the form, you’ll be able to download your iLO welcome letter, which includes your iLO key (that is valid through 2020), and instructions.

Free HPE iLO Advanced License key, instructions, and expiry date of January 1, 2021.

This is awesome, and will definitely help out a ton of IT administrators this year to remotely manage, monitor, and maintain their servers.

Thanks HPE!

May 182019
 
VMware Horizon View Mobile Client Android Windows 10 VDI Desktop

Since I’ve installed and configured my Nvidia GRID K1, I’ve been wanting to do a graphics quality demo video. I finally had some time to put a demo together.

I wanted to highlight what type of graphics can be achieved in a VDI environment. Even using an old Nvidia GRID K1 card, we can still achieve amazing graphical performance in a virtual desktop environment.

This demo outlines 3D accelerated graphics provided by vGPU.

Demo Video

Please see below for the video:

Information

Demo Specifications

  • VMware Horizon View 7.8
  • NVidia GRID K1
  • GRID vGPU Profile: GRID K180q
  • HPE ML310e Gen8 V2
  • ESXi 6.5 U2
  • Virtual Desktop: Windows 10 Enterprise
  • Game: Steam – Counter-Strike Global Offensive (CS:GO)

Please Note

  • Resolution of the Virtual Desktop is set to 1024×768
  • Blast Extreme is the protocol used
  • Graphics on game are set to max
  • Motion is smooth in person, screen recorder caused some jitter
  • This video was then edited on that VM using CyberLink PowerDirector
  • vGPU is being used on the VM
May 172019
 
Right side of MSA 2040

You may encounter a situation where you’re unable to connect to the management interface or NIC on your HPE MSA array. When this condition occurs, you are not able to ping the NIC, and the SMU (web interface) will not load.

When you visibly look at the array, the AMBER warning light may or may not be flashing.

If you have a dual controller setup, and connect to the SMU on the other controller, you may see numerous log entries where the management NIC port status changes repeatedly from up to down.

What’s happening

I’ve witnessed this issue occur on 2 separate HPE MSA 2040 storage arrays (both with dual controllers).

When you physically look at the management NICs on the controller in question, you’ll notice that the port status LED indicator turns on, and turns off repeatedly. The link status keeps changing from up to down (as reflected in the logs).

The Fix

Restarting the unit will have no effect. Changing the network cable will have no effect.

To resolve this issue, you must play with the network cable and re-seat it a few times (possibly half-way if possible a couple times as sketchy as that sounds).

If you can get the link status up, and disconnect/reconnect the cable before the light turns off, the connection will stay up. It will continue to function and survive restarts until sometime in the future when you disconnect it and reconnect it.

Replacing the controller may also fix it, however in the first instance I observed this, the replacement controller exhibited the same behavior months later in the future.

May 022019
 
Nvidia GRID Logo

I can’t tell you how excited I am that after many years, I’ve finally gotten my hands on and purchased an Nvidia Quadro K1 GPU. This card will be used in my homelab to learn, and demo Nvidia GRID accelerated graphics on VMware Horizon View. In this post I’ll outline the details, installation, configuration, and thoughts. And of course I’ll have plenty of pictures below!

The focus will be to use this card both with vGPU, as well as 3D accelerated vSGA inside in an HPE server running ESXi 6.5 and VMware Horizon View 7.8.

Please Note: As of late (late 2020), hardware h.264 offloading no longer functions with VMware Horizon and VMware BLAST with NVidia Grid K1/K2 cards. More information can be found at https://www.stephenwagner.com/2020/10/10/nvidia-vgpu-grid-k1-k2-no-h264-session-encoding-offload/

Please Note: Some, most, or all of what I’m doing is not officially supported by Nvidia, HPE, and/or VMware. I am simply doing this to learn and demo, and there was a real possibility that it may not have worked since I’m not following the vendor HCL (Hardware Compatibility lists). If you attempt to do this, or something similar, you do so at your own risk.

Nvidia GRID K1 Image

For some time I’ve been trying to source either an Nvidia GRID K1/K2 or an AMD FirePro S7150 to get started with a simple homelab/demo environment. One of the reasons for the time it took was I didn’t want to spend too much on it, especially with the chances it may not even work.

Essentially, I have 3 Servers:

  1. HPE DL360p Gen8 (Dual Proc, 128GB RAM)
  2. HPE DL360p Gen8 (Dual Proc, 128GB RAM)
  3. HPE ML310e Gen8 v2 (Single Proc, 32GB RAM)

For the DL360p servers, while the servers are beefy enough, have enough power (dual redundant power supplies), and resources, unfortunately the PCIe slots are half-height. In order for me to use a dual-height card, I’d need to rig something up to have an eGPU (external GPU) outside of the server.

As for the ML310e, it’s an entry level tower server. While it does support dual-height (dual slot) PCIe cards, it only has a single 350W power supply, misses some fancy server technologies (I’ve had issues with VT-d, etc), and only a single processor. I should be able to install the card, however I’m worried about powering it (it has no 6pin PCIe power connector), and having ESXi be able to use it.

Finally, I was worried about cooling. The GRID K1 and GRID K2 are typically passively cooled and meant to be installed in to rack servers with fans running at jet engine speeds. If I used the DL360p with an external setup, this would cause issues. If I used the ML310e internally, I had significant doubts that cooling would be enough. The ML310e did have the plastic air baffles, but only had one fan for the expansion cards area, and of course not all the air would pass through the GRID K1 card.

The Purchase

Because of a limited budget, and the possibility I may not even be able to get it working, I didn’t want to spend too much. I found an eBay user local in my city who had a couple Grid K1 and Grid K2 cards, as well as a bunch of other cool stuff.

We spoke and he decided to give me a wicked deal on the Grid K1 card. I thought this was a fantastic idea as the power requirements were significantly less (more likely to work on the ML310e) on the K1 card at 130 W max power, versus the K2 card at 225 W max power.

NVIDIA GRID K1 and K2 Specifications
NVIDIA GRID K1 and K2 Specification Table

The above chart is a capture from:
https://www.nvidia.com/content/cloud-computing/pdf/nvidia-grid-datasheet-k1-k2.pdf

We set a time and a place to meet. Preemptively I ran out to a local supply store to purchase an LP4 power adapter splitter, as well as a LP4 to 6pin PCIe power adapter. There were no available power connectors inside of the ML310e server so this was needed. I still thought the chances of this working were slim…

These are the adapters I purchased:

Preparation and Software Installation

I also decided to go ahead and download the Nvidia GRID Software Package. This includes the release notes, user guide, ESXi vib driver (includes vSGA, vGPU), as well as guest drivers for vGPU and pass through. The package also includes the GRID vGPU Manager. The driver I used was from:
https://www.nvidia.com/Download/driverResults.aspx/144909/en-us

To install, I copied over the vib file “NVIDIA-vGPU-kepler-VMware_ESXi_6.5_Host_Driver_367.130-1OEM.650.0.0.4598673.vib” to a datastore, enabled SSH, and then ran the following command to install:

esxcli software vib install -v /path/to/file/NVIDIA-vGPU-kepler-VMware_ESXi_6.5_Host_Driver_367.130-1OEM.650.0.0.4598673.vib

The command completed successfully and I shut down the host. Now I waited to meet.

We finally met and the transaction went smooth in a parking lot (people were staring at us as I handed him cash, and he handed me a big brick of something folded inside of grey static wrap). The card looked like it was in beautiful shape, and we had a good but brief chat. I’ll definitely be purchasing some more hardware from him.

Hardware Installation

Installing the card in the ML310e was difficult and took some time with care. First I had to remove the plastic air baffle. Then I had issues getting it inside of the case as the back bracket was 1cm too long to be able to put the card in. I had to finesse and slide in on and angle but finally got it installed. The back bracket (front side of case) on the other side slid in to the blue plastic case bracket. This was nice as the ML310e was designed for extremely long PCIe expansion cards and has a bracket on the front side of the case to help support and hold the card up as well.

For power I disconnected the DVD-ROM (who uses those anyways, right?), and connected the LP5 splitter and the LP5 to 6pin power adapter. I finally hooked it up to the card.

I laid the cables out nicely and then re-installed the air baffle. Everything was snug and tight.

Please see below for pictures of the Nvidia GRID K1 installed in the ML310e Gen8 V2.

Host Configuration

Powering on the server was a tense moment for me. A few things could have happened:

  1. Server won’t power on
  2. Server would power on but hang & report health alert
  3. Nvidia GRID card could overheat
  4. Nvidia GRID card could overheat and become damaged
  5. Nvidia GRID card could overheat and catch fire
  6. Server would boot but not recognize the card
  7. Server would boot, recognize the card, but not work
  8. Server would boot, recognize the card, and work

With great suspense, the server powered on as per normal. No errors or health alerts were presented.

I logged in to iLo on the server, and watched the server perform a BIOS POST, and start it’s boot to ESXi. Everything was looking well and normal.

After ESXi booted, and the server came online in vCenter. I went to the server and confirmed the GRID K1 was detected. I went ahead and configured 2 GPUs for vGPU, and 2 GPUs for 3D vSGA.

ESXi Graphics Settings for Host Graphics and Graphics Devices
ESXi Host Graphics Devices Settings

VM Configuration

I restarted the X.org service (required when changing the options above), and proceeded to add a vGPU to a virtual machine I already had configured and was using for VDI. You do this by adding a “Shared PCI Device”, selecting “NVIDIA GRID vGPU”, and I chose to use the highest profile available on the K1 card called “grid_k180q”.

Virtual Machine Edit Settings with NVIDIA GRID vGPU and grid_k180q profile selected
VM Settings to add NVIDIA GRID vGPU

After adding and selecting ok, you should see a warning telling you that must allocate and reserve all resources for the virtual machine, click “ok” and continue.

Power On and Testing

I went ahead and powered on the VM. I used the vSphere VM console to install the Nvidia GRID driver package (included in the driver ZIP file downloaded earlier) on the guest. I then restarted the guest.

After restarting, I logged in via Horizon, and could instantly tell it was working. Next step was to disable the VMware vSGA Display Adapter in the “Device Manager” and restart the host again.

Upon restarting again, to see if I had full 3D acceleration, I opened DirectX diagnostics by clicking on “Start” -> “Run” -> “dxdiag”.

DirectX Diagnostic Tool (dxdiag) showing Nvidia Grid K1 on VMware Horizon using vGPU k180q profile
dxdiag on GRID K1 using k180q profile

It worked! Now it was time to check the temperature of the card to make sure nothing was overheating. I enabled SSH on the ESXi host, logged in, and ran the “nvidia-smi” command.

nvidia-smi command on ESXi host showing GRID K1 information, vGPU information, temperatures, and power usage
“nvidia-smi” command on ESXi Host

According to this, the different GPUs ranged from 33C to 50C which was PERFECT! Further testing under stress, and I haven’t gotten a core to go above 56. The ML310e still has an option in the BIOS to increase fan speed, which I may test in the future if the temps get higher.

With “nvidia-smi” you can see the 4 GPUs, power usage, temperatures, memory usage, GPU utilization, and processes. This is the main GPU manager for the card. There are some other flags you can use for relevant information.

nvidia-smi with vgpu flag for vgpu information
“nvidia-smi vgpu” for vGPU Information
nvidia-smi with vgpu -q flag
“nvidia-smi vgpu -q” to Query more vGPU Information

Final Thoughts

Overall I’m very impressed, and it’s working great. While I haven’t tested any games, it’s working perfect for videos, music, YouTube, and multi-monitor support on my 10ZiG 5948qv. I’m using 2 displays with both running at 1920×1080 for resolution.

I’m looking forward to doing some tests with this VM while continuing to use vGPU. I will also be doing some testing utilizing 3D Accelerated vSGA.

The two coolest parts of this project are:

  • 3D Acceleration and Hardware h.264 Encoding on VMware Horizon
  • Getting a GRID K1 working on an HPE ML310e Gen8 v2

Highly recommend getting a setup like this for your own homelab!

Uses and Projects

Well, I’m writing this “Uses and Projects” section after I wrote the original article (it’s now March 8th, 2020). I have to say I couldn’t be impressed more with this setup, using it as my daily driver.

Since I’ve set this up, I’ve used it remotely while on airplanes, working while travelling, even for video editing.

Some of the projects (and posts) I’ve done, can be found here:

Leave a comment and let me know what you think! Or leave a question!