I notice quite a bit of traffic coming in (a lot of it from returning visitors) searching for information on VMware vSphere using iSCSI, specifically Lio-Target (because of its compatibility with SCSI persistent reservations).
In the past I've been jumping all over the place testing Lio-Target on different distributions, test scenarios, etc. I've now officially implemented it into my production network and just wanted to report that it's been running solid for a few months.
Current Working Configuration (Stable):
I currently have numerous HP servers (ML350s, DL360s) running ESXi off an internal USB key. These ESXi hosts are accessing numerous iSCSI targets over gigabit, hosted on a DL360 G6 with 2 X MSA20 storage units. The server hosting the storage is running Ubuntu 10.10 and has been rock solid with absolutely no issues. I'm fully utilizing VMotion amongst all the hosts, and all hosts have concurrent access to the iSCSI targets. This is running in full production and I'm fully confident in the configuration and setup.
Future Plans:
Lio-Target is going upstream into the Linux kernel in the next release (2.6.38). In the testing I did (and blogged about) over the past months, I was not able to get the newer versions of Lio-Target running stably on CentOS. Once a new version of CentOS is released, or a kernel upgrade is available to bring CentOS to 2.6.38, I will install CentOS on the storage server and add more disk space. Once that change is complete, it will conclude any changes for a while (excluding adding more ESXi hosts).
If anyone has any questions about my setup or a similar configuration, or has any issues with Lio-Target, please feel free to leave a comment and I'll see if I can help!
Comments:
I've also been following your articles and implementing a similar setup. Great articles! LIO iSCSI performance is fantastic and very stable on Ubuntu 10.10. I'm now trying to add DRBD and HA into the mix for a fully replicated storage scenario. So far this has been difficult, so I was wondering if you had tried anything similar? The problem seems to revolve around HA (Corosync/Pacemaker) and the outdated iSCSITarget and iSCSILogicalUnit scripts that are included. These scripts only support block_io, and in this instance I'd prefer to use file_io. The scripts will need to be modified, so I thought I'd check to see if anyone had already "invented the wheel" before I attempted it.
The LIO wiki also talks about using the built-in md softraid to replicate between two iSCSI datastores instead of using DRBD. This would work as well, but still runs into the same problem with HA.
I've also diverted to other options such as GlusterFS and Lustre, and then adding LIO on top of them. All of them have the same HA problem.
Anyway, if you've discovered any solutions or you're interested in collaborating, let me know!
Thanks,
Hi Wade,
Thanks!
Unfortunately, as of yet I haven't touched anything in regards to DRBD or HA... It has been on my list of things to do, but I've been busy! lol
Maybe someone else who follows these posts can contribute something. Once I get a test environment set up I'll let you know how things go; hopefully it'll be sometime in the next couple of weeks.
From my preliminary research, I think I may know why it would only work with block IO, but I'm not saying anything until I actually know what I'm talking about.
Since I've started playing with Lio, I've been in touch with the guys who developed it, so maybe they will have some recommendations as to how to achieve what we are both thinking.
I'll keep you posted! and if you find anything in the meantime, let us know :)
Thanks,
Stephen
Could you throw together a quick howto on compiling an Ubuntu kernel with lio?
Gerald
Hi Gerald,
If you actually take a look at my blog, there's another thread titled something like "Getting Lio-Target running on CentOS".
The procedure is almost identical, with a few exceptions...
1) You don't need the "Use deprecated SysFS" step, since Ubuntu doesn't use it.
2) With Ubuntu you have to use its own command to generate an initial ramdisk (initrd). The command is "update-initramfs -c -k kernel-ver". For example: "update-initramfs -c -k 2.6.34-iscsi"
3) After you have the kernel and initial ramdisk, you have to regenerate the GRUB2 bootloader config by issuing "update-grub".
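Putting the Ubuntu-specific steps above together, the tail end of the kernel install would look roughly like this (the "2.6.34-iscsi" version string is just the example from this thread; substitute whatever local version suffix you built with):

```shell
# After the usual make / make modules_install / make install of the
# patched kernel tree:
update-initramfs -c -k 2.6.34-iscsi   # create the initial ramdisk for the new kernel
update-grub                           # regenerate the GRUB2 boot menu to include it
```

Both commands need root, and update-grub picks the new kernel up automatically from /boot.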
Hope this helps...
Stephen
LIO Target is absolutely glorious! I'm surprised it hasn't gone mainstream. At least not yet :)
DRBD-based HA is really a poor man's HA. An active/standby HA model is not going to shine from either a performance or a fast-node-switchover point of view. Companies using DRBD as the backbone of their HA solutions (like, for example, Open-E) tend to install a third "pinging" node to work around this issue. But they can do nothing about the limited performance a shared-IP cluster provides.
Anton Kolomyeytsev
Proud member of the iSCSI SAN community.
Thanks!
I have a working kernel under Ubuntu 10.10.
I downloaded lio-utils, and I get a failure on /etc/init.d/target start:
Loading target_core_mod/ConfigFS core: [OK]
Calling ConfigFS script /etc/target/tcm_start.sh for target_core_mod: [FAILED]
Calling ConfigFS script /etc/target/lio_start.sh for iscsi_target_mod: [OK]
So I downgraded to lio-utils 3.0 (after cleaning up everything I could), and that also failed.
Any ideas?
Here's the actual error:
tcm_node --block iblock_1/array /dev/md0
ConfigFS HBA: iblock_1
Successfully added TCM/ConfigFS HBA: iblock_1
ConfigFS Device Alias: array
Device Params ['/dev/md0']
IBLOCK: createvirtdev failed for enable_opt with echo 1 > /sys/kernel/config/target/core/iblock_1/array/enable
Unable to register TCM/ConfigFS storage object: /sys/kernel/config/target/core/iblock_1/array
Just curious, did you build Lio as kernel modules, or did you build it in to the kernel?
Sounds like it can't find the module, or it's trying to load it but there might be a kernel mismatch...
Could you post the last section of your "dmesg" that coincides with the /etc/init.d/target start ?
lsmod shows me:
iscsi_target_mod 265970 6
target_core_mod 318058 9 iscsi_target_mod
configfs 27505 3 iscsi_target_mod,target_core_mod
Mar 11 13:06:28 openfiler kernel: [ 289.736936] TARGET_CORE[0]: Loading Generic Kernel Storage Engine: v3.4.0 on Linux/x86_64 on 2.6.34
Mar 11 13:06:28 openfiler kernel: [ 289.737074] TARGET_CORE[0]: Initialized ConfigFS Fabric Infrastructure: v3.4.0 on Linux/x86_64 on 2.6.34
Mar 11 13:06:28 openfiler kernel: [ 289.737077] SE_PC[0] - Registered Plugin Class: TRANSPORT
Mar 11 13:06:28 openfiler kernel: [ 289.737080] PLUGIN_TRANSPORT[1] - pscsi registered
Mar 11 13:06:28 openfiler kernel: [ 289.737082] PLUGIN_TRANSPORT[4] - iblock registered
Mar 11 13:06:28 openfiler kernel: [ 289.737083] PLUGIN_TRANSPORT[5] - rd_dr registered
Mar 11 13:06:28 openfiler kernel: [ 289.737085] PLUGIN_TRANSPORT[6] - rd_mcp registered
Mar 11 13:06:28 openfiler kernel: [ 289.737086] PLUGIN_TRANSPORT[7] - fileio registered
Mar 11 13:06:28 openfiler kernel: [ 289.737088] SE_PC[1] - Registered Plugin Class: OBJ
Mar 11 13:06:28 openfiler kernel: [ 289.737090] PLUGIN_OBJ[1] - dev registered
Mar 11 13:06:28 openfiler kernel: [ 289.737093] CORE_HBA[0] - TCM Ramdisk HBA Driver v3.1 on Generic Target Core Stack v3.4.0
Mar 11 13:06:28 openfiler kernel: [ 289.737095] CORE_HBA[0] - Attached Ramdisk HBA: 0 to Generic Target Core TCQ Depth: 256 MaxSectors: 1024
Mar 11 13:06:28 openfiler kernel: [ 289.737097] CORE_HBA[0] - Attached HBA to Generic Target Core
Mar 11 13:06:28 openfiler kernel: [ 289.737100] RAMDISK: Referencing Page Count: 8
Mar 11 13:06:28 openfiler kernel: [ 289.737104] CORE_RD[0] - Built Ramdisk Device ID: 0 space of 8 pages in 1 tables
Mar 11 13:06:28 openfiler kernel: [ 289.737107] rd_dr: Using SPC_PASSTHROUGH, no reservation emulation
Mar 11 13:06:28 openfiler kernel: [ 289.737109] rd_dr: Using SPC_ALUA_PASSTHROUGH, no ALUA emulation
Mar 11 13:06:28 openfiler kernel: [ 289.737111] CORE_RD[0] - Activating Device with TCQ: 0 at Ramdisk Device ID: 0
Mar 11 13:06:28 openfiler kernel: [ 289.737204] Vendor: LIO-ORG Model: RAMDISK-DR Revision: 3.1
Mar 11 13:06:28 openfiler kernel: [ 289.737214] Type: Direct-Access ANSI SCSI revision: 05
Mar 11 13:06:28 openfiler kernel: [ 289.737231] T10 VPD Unit Serial Number: 1234567890:0_0
Mar 11 13:06:28 openfiler kernel: [ 289.737244] T10 VPD Page Length: 38
Mar 11 13:06:28 openfiler kernel: [ 289.737246] T10 VPD Identifer Length: 34
Mar 11 13:06:28 openfiler kernel: [ 289.737248] T10 VPD Identifier Association: addressed logical unit
Mar 11 13:06:28 openfiler kernel: [ 289.737249] T10 VPD Identifier Type: T10 Vendor ID based
Mar 11 13:06:28 openfiler kernel: [ 289.737251] T10 VPD ASCII Device Identifier: LIO-ORG
Mar 11 13:06:28 openfiler kernel: [ 289.737266] CORE_RD[0] - Added TCM DIRECT Ramdisk Device ID: 0 of 8 pages in 1 tables, 32768 total bytes
Mar 11 13:06:29 openfiler kernel: [ 290.789980] Target_Core_ConfigFS: Located se_plugin: ffff88011474e0e0 plugin_name: iblock hba_type: 4 plugin_dep_id: 1
Mar 11 13:06:29 openfiler kernel: [ 290.789986] CORE_HBA[0] - TCM iBlock HBA Driver 3.1 on Generic Target Core Stack v3.4.0
Mar 11 13:06:29 openfiler kernel: [ 290.789988] CORE_HBA[0] - Attached iBlock HBA: 1 to Generic Target Core TCQ Depth: 512
Mar 11 13:06:29 openfiler kernel: [ 290.789990] CORE_HBA[1] - Attached HBA to Generic Target Core
Mar 11 13:06:29 openfiler kernel: [ 290.790025] IBLOCK: Allocated ib_dev for array
Mar 11 13:06:29 openfiler kernel: [ 290.790029] Target_Core_ConfigFS: Allocated se_subsystem_dev_t: ffff88010765b000 se_dev_su_ptr: ffff8801070d0000
Mar 11 13:06:29 openfiler kernel: [ 290.793978] Target_Core_ConfigFS: iblock_1/array set udev_path: /dev/md0
Mar 11 13:06:29 openfiler kernel: [ 290.794815] IBLOCK: Referencing UDEV path: /dev/md0
Mar 11 13:06:29 openfiler kernel: [ 290.795638] Missing iblock_major= and iblock_minor= parameters
Mar 11 13:06:29 openfiler kernel: [ 290.796112] Target_Core_ConfigFS: Calling t->free_device() for se_dev_su_ptr: ffff8801070d0000
Mar 11 13:06:29 openfiler kernel: [ 290.796115] Target_Core_ConfigFS: Deallocating se_subsystem_dev_t: ffff88010765b000
Mar 11 13:06:29 openfiler kernel: [ 290.843195] Target_Core_ConfigFS: REGISTER -> group: ffffffffa065a400 name: iscsi
Mar 11 13:06:29 openfiler kernel: [ 290.854288] Linux-iSCSI.org iSCSI Target Core Stack v3.4.0 on Linux/x86_64 on 2.6.34
Mar 11 13:06:29 openfiler kernel: [ 290.854312] <<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>
Mar 11 13:06:29 openfiler kernel: [ 290.854315] Initialized struct target_fabric_configfs: ffff88010717a000 for iscsi
Mar 11 13:06:29 openfiler kernel: [ 290.854318] <<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>
Mar 11 13:06:29 openfiler kernel: [ 290.854319] LIO_TARGET[0] - Set fabric -> lio_target_fabric_configfs
Mar 11 13:06:29 openfiler kernel: [ 290.854455] Spawned 4 thread set(s) (8 total threads).
Mar 11 13:06:29 openfiler kernel: [ 290.854615] TARGET_CORE[iSCSI]: Allocated Discovery se_portal_group_t for endpoint: None, Portal Tag: 1
Mar 11 13:06:29 openfiler kernel: [ 290.854626] CORE[0] - Allocated Discovery TPG
Mar 11 13:06:29 openfiler kernel: [ 290.854628] Loading Complete.
Mar 11 13:06:29 openfiler kernel: [ 290.855009] Target_Core_ConfigFS: REGISTER -> Located fabric: iscsi
Mar 11 13:06:29 openfiler kernel: [ 290.855011] Target_Core_ConfigFS: REGISTER -> ffffffffa073b980
Mar 11 13:06:29 openfiler kernel: [ 290.855014] Target_Core_ConfigFS: REGISTER -> Allocated Fabric: iscsi
Mar 11 13:06:29 openfiler kernel: [ 290.855015] Target_Core_ConfigFS: REGISTER -> Set tf->tf_fabric for iscsi
Mar 11 13:06:29 openfiler kernel: [ 290.855041] lio_target_call_coreaddtiqn(): name: iqn.2011.com.norscan.iscsi:array
Mar 11 13:06:29 openfiler kernel: [ 290.855044] CORE[0] - Added iSCSI Target IQN: iqn.2011.com.norscan.iscsi:array
Mar 11 13:06:29 openfiler kernel: [ 290.855045] LIO_Target_ConfigFS: REGISTER -> iqn.2011.com.norscan.iscsi:array
Mar 11 13:06:29 openfiler kernel: [ 290.855047] LIO_Target_ConfigFS: REGISTER -> Allocated Node: iqn.2011.com.norscan.iscsi:array
Mar 11 13:06:29 openfiler kernel: [ 290.855053] lio_target_tiqn_addtpg() parent name: iqn.2011.com.norscan.iscsi:array
Mar 11 13:06:29 openfiler kernel: [ 290.855077] TARGET_CORE[iSCSI]: Allocated Normal se_portal_group_t for endpoint: iqn.2011.com.norscan.iscsi:array, Portal Tag: 1
Mar 11 13:06:29 openfiler kernel: [ 290.855091] CORE[iqn.2011.com.norscan.iscsi:array]_TPG[1] - Added iSCSI Target Portal Group
Mar 11 13:06:29 openfiler kernel: [ 290.855093] LIO_Target_ConfigFS: REGISTER -> iqn.2011.com.norscan.iscsi:array
Mar 11 13:06:29 openfiler kernel: [ 290.855094] LIO_Target_ConfigFS: REGISTER -> Allocated TPG: tpgt_1
Mar 11 13:06:29 openfiler kernel: [ 290.855116] LIO_Target_ConfigFS: REGISTER -> iqn.2011.com.norscan.iscsi:array TPGT: 1 LUN: 0
Mar 11 13:06:29 openfiler kernel: [ 290.857092] LIO_Target_ConfigFS: DEREGISTER -> iqn.2011.com.norscan.iscsi:array TPGT: 1 LUN: 0
Mar 11 13:06:29 openfiler kernel: [ 290.900160] LIO_Target_ConfigFS: REGISTER -> iqn.2011.com.norscan.iscsi:array TPGT: 1 PORTAL: 192.168.3.100:3260
Mar 11 13:06:29 openfiler kernel: [ 290.900215] CORE[0] - Added Network Portal: 192.168.3.100:3260 on TCP on network device: None
Mar 11 13:06:29 openfiler kernel: [ 290.900220] CORE[iqn.2011.com.norscan.iscsi:array] - Added Network Portal: 192.168.3.100:3260,1 on TCP on network device: None
Mar 11 13:06:29 openfiler kernel: [ 290.900222] CORE[iqn.2011.com.norscan.iscsi:array]_TPG[1] - Incremented np_exports to 1
Mar 11 13:06:29 openfiler kernel: [ 290.900224] LIO_Target_ConfigFS: addnptotpg done!
Mar 11 13:06:29 openfiler kernel: [ 290.941976] Disabling iSCSI Authentication Methods for TPG: 1.
Mar 11 13:06:29 openfiler kernel: [ 290.983976] iSCSI_TPG[1] - Added ACL with TCQ Depth: 16 for iSCSI Initiator Node: iqn.2011-03.com.example:9031b9c2
Mar 11 13:06:29 openfiler kernel: [ 290.983981] LIO_Target_ConfigFS: REGISTER -> iqn.2011.com.norscan.iscsi:array TPGT: 1 Initiator: iqn.2011-03.com.example:9031b9c2 CmdSN Depth: 16
Mar 11 13:06:29 openfiler kernel: [ 290.984007] LIO_Target_ConfigFS: Initialized Initiator LUN ACL: iqn.2011-03.com.example:9031b9c2 Mapped LUN: lun_0
Mar 11 13:06:29 openfiler kernel: [ 291.027133] iSCSI_TPG[1] - Enabled iSCSI Target Portal Group
Apparently, it errors out when I reference /dev/md0 with either block or fileio. If I point fileio at a regular file, it works.
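For reference, the fileio variant that worked looked roughly like this (the file path and size are placeholders, and I'm assuming lio-utils' tcm_node --fileio <hba/dev> <file> <size_in_bytes> form, analogous to the --block invocation above):

```shell
# Back the TCM storage object with a regular file instead of /dev/md0.
# 536870912 = 512 MB backing file; adjust path and size for a real setup.
tcm_node --fileio fileio_0/array /storage/array.img 536870912
```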
Thanks for your help! I had to go Ubuntu 10.04 and backports 3.5. Essentially the following:
aptitude install git-core build-essential libsnmp-dev
git clone git://risingtidesystems.com/lio-core-backports.git lio-core-backports.git
cd lio-core-backports.git
make
make install
cd ..
git clone git://git.kernel.org/pub/scm/linux/storage/lio/lio-utils.git lio-utils.git
cd lio-utils.git
git checkout --track -b lio-3.5 origin/lio-3.5
make
make install
What kind of speeds are you getting? Using fileio I get 12 MB/s writes and 60 MB/s reads. I'm sure there are some tweaks I can make, but documentation is thin.
What is your opinion of write-back vs write-through on a UPS-protected/monitored LIO target? Do you know how to turn on write-back so I can test it?
Thanks,
Gerald
I think a change might have been implemented in one of the newer versions (or a needed change hasn't been made yet), but lio-utils is calling the wrong configuration namespace...
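Going by the "Missing iblock_major= and iblock_minor= parameters" line in your dmesg, that LIO core still seems to expect the backing device's major:minor to be passed through ConfigFS, which newer lio-utils no longer sends. Something roughly like this might work as a manual test (the paths come from your error output; 9:0 is just what /dev/md0 usually is, so confirm it first — this is a guess on my part, not a verified fix):

```shell
ls -l /dev/md0    # confirm the device's major,minor numbers (commonly 9, 0)

# Feed the parameters the kernel is asking for into the existing
# iblock_1/array ConfigFS object, then try enabling it again:
echo udev_path=/dev/md0 > /sys/kernel/config/target/core/iblock_1/array/control
echo iblock_major=9,iblock_minor=0 > /sys/kernel/config/target/core/iblock_1/array/control
echo 1 > /sys/kernel/config/target/core/iblock_1/array/enable
```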
I'll keep you posted if I find anything...
As for write caching and speeds: I have two HP MSA20 Modular Smart Arrays hooked up to my storage box. One has 12 X 250GB drives configured in RAID 5. The second MSA20 is running 5 X 500GB in RAID 5.
I'm accessing the drives via block IO. The MSA20s have write caching enabled, since the applicable RAID cards and the controller cards inside the MSA20s all have battery-backed cache. I don't know the exact read/write speeds (since VMware is accessing the iSCSI targets), but I'm hitting 120MB/sec when doing things like Storage vMotion, etc. I believe I could hit higher speeds, but I'm capped by network bandwidth.
Keep in mind I'm running a gigabit network, and my ESXi hosts and the storage box itself are tanks: multi-core DL360 G5s, multi-core ML350 G5s, plenty of RAM (even inside the SAN server), etc.
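As a quick sanity check on that cap, gigabit's ceiling works out like this (the ~5% protocol overhead figure is a rough rule of thumb for TCP/IP + iSCSI framing, not a measurement):

```shell
# 1 Gbit/s expressed as MB/s, then with ~5% knocked off for protocol overhead.
echo $(( 1000000000 / 8 / 1000000 ))          # 125 MB/s raw line rate
awk 'BEGIN { printf "%.1f\n", 125 * 0.95 }'   # 118.8 MB/s usable, roughly
```

So the ~120MB/sec I see during Storage vMotion is about as good as a single gigabit link gets.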
I'm on GigE as well. I modified the LIO 3.5 source code to allow write-back with fileio, and my speeds went up to 40 MB/s. That's still half of IET. Nicholas Bellinger gave me a tip on speeding things up, so I'll be trying that today.
Let me know how it turns out.
PS, 2.6.38 was released yesterday. I haven't confirmed this, but Lio-Target was merged into the kernel source for the release candidates. Since this is "mainline and stable", I'm looking forward to giving it a try.
I'm hoping to get 2.6.38 set up this weekend if I have time (there's never enough time lol). I'm not sure if I'll try it on CentOS, but I'll probably get it running on the latest Ubuntu.
Sorry for the delay in responding.
Unfortunately, my LIO test machine locked up twice during load testing, with no log data. I switched back to IET, and the lockups stopped.
I tried LIO in order to test multipathing, and although I got a balanced load across 3 GigE cards, I saw no increase in throughput.
Looks like I'm staying with IET for a while yet.