Feb 20 2013
 

Recently it was time to refresh a client’s disaster recovery solution. We were getting ready to retire our 5 year old HP MSL2024 with an LTO-4 tape drive, and implement a new HP MSL2024 library with a SAS LTO-6 tape drive. We need to use tape since a full backup is over 6TB.

The server that is connected to all this equipment is an HP Proliant DL360 G6 with an HP Smart Array P800 Controller. The P800 already has an HP StorageWorks MSA60 unit attached to it with 12 drives.

Documentation for the P800 mentioned tape drive support. While I know that the P800 is only capable of 3Gb/sec, this is more than enough, and chances are the hard drives will be maxed out reading anyways.

The client approved the purchase, and we brought in the hardware and installed it. First we had to install Backup Exec 2012 (since only the 2012 SP1a HCL specifies support for LTO-6), which was messy but we did it. Then we re-configured all of our backup jobs, since the old jobs were migrated horribly.

When trying to run our first backup, the backup failed. I tried again numerous times, only to get these errors:

  • Storage device “HP 07” reported an error on a request to rewind the media.
  • Final error: 0xe00084f0 – The device timed out.
  • Storage device “HP 07” reported an error on a request to write data to media.
  • Storage device “HP 6” reported an error on a request to write data to media.
  • PvlDrive::DisableAccess() – ReserveDevice failed, offline device
  • ERROR = 0x0000001F (ERROR_GEN_FAILURE)

Also, every time the backup would fail, the library and the tape drive would disappear from the computer’s “Device Manager”. Essentially the device would lose its connection. Even when logging in to the HP MSL2024 web interface, it would state the SAS port was disconnected after a backup job failed. To resolve this, you’d have to restart the library and restart the Backup Exec services. One interesting thing: when this occurred, my company’s monitoring and management software would report that a RAID failure had occurred at the customer’s site, until the MSL was restarted (this was kinda cool).

 

I immediately called HP support. They mentioned the library was at firmware 5.80 and asked me to try updating it. We tried and it failed, since the firmware file didn’t match its checksum. I was told that this was not important, as 5.90 doesn’t contain any major changes. We went on to spend 6 hours on the phone trying to disable Insight agents, check drivers, etc… Finally the tech decided to replace the tape drive.

Since LTO-6 is brand new technology, even with a 4 hour response contract, it took HP around 2 weeks to replace the drive since none were in Canada. During this time, I called two more times. The second tech told me that at the moment, no HP controllers support the HP LTO-6 tape drives (you’re kidding me right?), and the third said he couldn’t provide me any information as there’s nothing in the documentation that specifies what controllers are compatible. All 3 techs mentioned that having the P800 controller in the server host both the MSA60 and the MSL2024 was probably causing the issues.

We received the new tape drive, tested, and the backups failed. I sent that drive back (it was a repaired unit) and kept the original brand new one. After this I tried numerous things and googled for days. I was just about to quote the client a new controller card when I finally decided to give HP another call.

On this call, he escalated the issue to engineers. Later that night I received an e-mail stating that library firmware 5.90 is required for support for the LTO-6 tape drives. I was shocked, angry, etc… It turns out that library firmware 5.80 was “Recalled” due to major issues a while back.

Since HP Library and Tape Tools (LTT) couldn’t load the firmware, I just downloaded it manually and flashed it via the MSL 2024 web interface. After this I restarted the Backup Exec services, performed an inventory, and did a minor backup (around 130GB). Keep in mind that when the backups originally failed, the size didn’t matter; the backup would simply fail just before it completed.

The backup completed! Later on that night I ran a full backup of 5TB (2 servers and 2 MSA60s) and it completed 100% successfully. Even with the MSA60 under extreme load maxing out the drives, this did not in any way impede performance of the LTO-6 tape drive/library.

 

So please, if you’re having this issue consider the following:

1. Tape library must be at firmware version 5.90 to support LTO-6 Tape drives. Always always always make sure you have the latest firmware.

2. I have a working configuration of a P800 controlling both an HP MSA60, and a HP MSL 2024 backup library and it’s working 100%

3. Make sure you have Backup Exec 2012 SP1a installed as it’s required for LTO-6 compatibility (make sure you read about the major changes upgrading to 2012 first, I can’t stress this enough!!!)

 

I hope this helps some of you out there as this was consuming my life for numerous weeks.

Nov 22 2012
 

Just something I wanted to share in case anyone else ran in to this issue…

At a specific client we have 2 X MSA60 units attached via Smart Array P800 controllers to 2 X DL360 G6 servers. This combination of server, controller, and storage units was purchased just after they were originally released by HP.

I’m writing about a specific condition in which after a drive fails in RAID 5, during rebuild, numerous (and I mean over 70,000) event log entries in the event viewer state: “Surface analysis has repaired an inconsistent stripe on logical drive 1 connected to array controller P800 located in server slot 2. This repair was conducted by updating the parity data to match the data drive contents.”

 

On one of these arrays, shortly after a successful rebuild and while the event viewer was spitting these errors out, another drive failed. At this point the RAID array went offline, and the entire RAID array and all its contents were unrecoverable. Keep in mind this occurred after the rebuild, while a surface scan was in progress. In this specific case we rebuilt the array, restored from backup and all was good. After mentioning this to HP support techs, they said it was safe to ignore these messages as they were fine and informational (I didn’t feel this was the case). After creating the new RAID array on this specific unit, we never saw these messages on that unit again.

On the other MSA60 unit however, we regularly received these messages (we always keep the firmware of the MSA60 unit and the P800 controller up to date). Again, I asked HP support numerous times and they said we could safely ignore these. Recently, during a power outage, the P800 controller flagged its cache batteries as failed. At the same time a drive failed, and we were yet again presented with these errors after the rebuild. After getting the drive replaced, I contacted HP again, and finally insisted that they investigate this issue regarding the event log errors. This specific time, new errors about parity were presenting themselves in the event viewer.

After being put on hold for some time, they came back and mentioned that these errors are probably caused by the RAID array having been created with a very early firmware version. They recommended deleting the logical array and re-creating it with the latest firmware to avoid any data loss. I specifically asked if there was a chance that the array could fail due to these errors and the fact it was created with an early firmware version, and they confirmed it. I went ahead, created backups, deleted the array and re-created it, restored the backup, and the errors are no longer present.

 

I just wanted to create this blog post, as I see numerous people are searching for the meaning of these errors, and wanted to shed some light and maybe help a few of you out, to help you avoid any future catastrophic problems!

Oct 28 2012
 

I remember months ago when I was so excited to hear that Microsoft would be releasing their own tablet. I swore I would be one of the first people to get their hands on these devices… Unfortunately, things didn’t work out the way I thought.

Now that more detailed specifications and capabilities have been published, I’ve sadly changed my mind.

While the device is still a “rock-star” device, with the capabilities it does have, I’m not so sure it’s designed for the professional. With that being said, there is a “pro” version coming out, however it will be slightly larger, slightly heavier, and will be running on the x86 architecture, instead of the lightweight, battery saving ARM architecture.

In my opinion, they should have allowed the Windows RT release to be “upgraded” to a domain join-able version that supports GPO, etc…

 

A few reasons why I decided NOT to purchase the Microsoft Surface:

1) Lack of LTE / cell modem capabilities – I envisioned myself having access to the internet wherever I went. I wanted to have the ability to edit Microsoft Word or other Office suite application files seamlessly live over VPN. This way I could go to meetings, take notes, and have them stored directly on my servers back at the office. Not only does lack of LTE stop this from happening, but it also stops me from having the ability to read/write e-mails on the go wherever I am. I want to get e-mails instantly like I do on my phone, I don’t want to have to wait for a WiFi hotspot to become available.

2) Lack of domain capabilities – It would have been nice to be able to join it to the domain for single-sign on, and access to network resources.

3) Lack of retail locations in Canada – I remember seeing something that they had a Microsoft Store in Edmonton, I tweeted the Microsoft Store twitter account and asked if they are planning on opening a location in Calgary. They replied and said they have one in Edmonton. I’m not willing to drive 350 kilometers to just play with a device to see if I want to purchase one, then drive the 350 kilometers back (possibly without the device if I chose not to purchase it).

4) No clear explanation on application support – While there is a Windows Store that has applications for the Metro style interface, there is a lack of information on actual windows application support for building applications on the ARM architecture for Windows RT. It would be awesome if people could start building windows applications for the ARM architecture, but from what I have read that isn’t the case.

 

It’s unfortunate that there are these shortcomings. I would have loved to flash this device in the face of iPad lovers. However since I won’t be able to SSH using Putty compiled for ARM, and since I won’t have access to e-mail wherever I am, and won’t have access to any of my office documents on my servers wherever I am, I don’t think I’ll be pulling the trigger anytime soon.

Some other companies are manufacturing Windows RT tablets with built in LTE capabilities, however I would much prefer to have it built in to the beautifully engineered Microsoft Surface.

Jul 23 2012
 

Interesting story:

On the weekend, my Trixbox VoIP PBX (which runs Asterisk) failed. Unfortunately, during the restore process the hard drive also blew up. I set up a ML350G5 as a temporary VoIP PBX, however today I had the chance to set up an old Acer Aspire Netbook which I had sitting in a box as my new permanent VoIP PBX.

I used a USB CD-Rom to install Trixbox, as for some reason I couldn’t get it to load the kickstart file during a grub boot off a USB key, also couldn’t get it to mount the NFS install export (maybe the kernel didn’t have support for NFS?).

 

The netbook had decent specs:

Dual-Core 1.5GHz Processor

1GB Ram

Battery (this means I don’t have to put it on my UPS for all my server equipment)

 

Got it set up and it’s running great! 🙂

Jun 30 2012
 

As most of you know, I have 2 Raspberry Pi.  One has tons of storage, that I do a lot of my hacking on, and development. The other has less storage, I usually keep clean, and don’t do anything funky on it. I like the second to have a clean usable image (even though this doesn’t make sense, and I don’t use it in production, I call the second my “Production Pi”, the first is my “devpi”).

For my development Raspberry Pi, I maintain two installs. One install is a Fedora 17 ARM install with my own kernel that all fits on a 16GB SD card, but my second install (the crazy packed install) actually boots its kernel off a small 2GB SD card, and loads its rootfs off of an external 500GB USB drive. The SD card install was easy, however the rootfs on USB took a bit of work, so I’m going to share with you how I did this.

To have your Raspberry Pi use a rootfs off a USB drive you will need to know what modules your USB drive currently uses on an unmodified Raspberry Pi kernel, and you will need a separate linux box and an SD card reader/writer to prepare the rootfs and perform these instructions. In my case, I used my 16GB install as a template and cloned it to my 500GB USB drive.

DISCLAIMER: These instructions are what I did to set up a rootfs on a USB device. I am in no way telling you to follow these instructions, and if you do, and any damage occurs, I am not liable or responsible. These instructions, if not followed properly, can cause damage to your existing linux install on your computer and your Raspberry Pi. Only follow these if you understand them.

 

Let’s get started, first let’s prepare the new bootable SD card which loads our kernel:

1) First we want to take an image of the partition table and the boot partition we already boot from on our existing working Raspberry Pi Fedora 17 ARM install. We only want to image the partition table and boot partition. First we use fdisk to find out how big the boot partition is:

fdisk /dev/sdb

Press “p” and hit enter to print the partition table. In my case it outputted:

Welcome to fdisk (util-linux 2.21.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Command (m for help): p

Disk /dev/sdb: 16.0 GB, 16009658368 bytes
64 heads, 32 sectors/track, 15268 cylinders, total 31268864 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *          63     1044224      522081    c  W95 FAT32 (LBA)
/dev/sdb2         1044225    27074558    13015167   83  Linux
/dev/sdb3        27074559    31268863     2097152+  83  Linux

This shows that my boot partition starts at sector 63 and goes till 1044224 (both inclusive). Each sector is 512 bytes, so the boot partition is 1044162 sectors, or 534610944 bytes (~510 megabytes). The reason that the partition starts at sector 63 is because everything prior to that on the SD card contains partition information, etc… When we image, we will start from the beginning to grab that partition information. After you record the End sector for your boot partition, hit “q” and enter to exit fdisk.
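
The sector arithmetic can be double-checked with a bit of shell arithmetic. This is only a sketch using the Start and End values from my fdisk output; the variable names are made up for illustration, and your numbers will differ. It relies on fdisk's Start and End both being inclusive, and on sectors being numbered from 0.

```shell
#!/bin/sh
# Values copied from the fdisk output above (yours will differ).
SECTOR_SIZE=512
BOOT_START=63
BOOT_END=1044224

# Sectors in the boot partition itself (Start and End are both inclusive).
BOOT_SECTORS=$((BOOT_END - BOOT_START + 1))

# Sectors from the very start of the card through the end of the boot
# partition. Sector numbering starts at 0, so this is End + 1.
IMAGE_SECTORS=$((BOOT_END + 1))

echo "boot partition: $BOOT_SECTORS sectors, $((BOOT_SECTORS * SECTOR_SIZE)) bytes"
echo "image to copy:  $IMAGE_SECTORS sectors, $((IMAGE_SECTORS * SECTOR_SIZE)) bytes"
```

The second number is the one that matters when imaging from the start of the card, since it also covers the partition table in front of the boot partition.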

Let’s create the image. In order to do this, you need to find out what sd* device your SD card is on your current system. If you do a dmesg after connecting the sd card, you should see it. Make sure you choose the correct sd* or you could damage your current linux system. In my case, after I connected the SD card to my linux workstation, it became /dev/sdb.
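
Since picking the wrong sd* device is the easiest way to wreck your workstation, a small guard can help. The following is a rough sketch and not something I used at the time: `device_in_use` is a made-up helper that checks whether a device (or one of its partitions) shows up as mounted in /proc/mounts. A mounted device isn't necessarily the wrong one (some desktops auto-mount SD cards), but it's a good cue to stop and look before going any further.

```shell
#!/bin/sh
# device_in_use DEVICE [MOUNTS_FILE]
# Hypothetical helper: succeeds (exit 0) if DEVICE, or anything whose name
# starts with DEVICE (such as its partitions), is listed as a mounted source
# in MOUNTS_FILE (default: /proc/mounts).
device_in_use() {
    dev=$1
    mounts=${2:-/proc/mounts}
    # The first field of each line is the mounted device, e.g. /dev/sda1.
    awk -v dev="$dev" '
        index($1, dev) == 1 { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$mounts"
}

# Example: stop and think if the device you picked backs a mounted filesystem.
# if device_in_use /dev/sdb; then echo "/dev/sdb is mounted, double-check it!" >&2; fi
```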

Let’s make the image:

dd if=/dev/sdb of=image.img bs=512 count=1044225

This will create an image called image.img of the boot partition and partition table. Notice how we use a bs=512 (this is because each sector is 512 bytes), and a count=1044225 (this is the End sector of the boot partition plus one, since sectors are numbered starting at 0 and we want the last sector included).
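
Before moving on, it may be worth confirming the image is the size you expect. This is a minimal sketch and not part of my original steps: `check_image_size` is a helper name I made up, and it simply compares the file size against the sector count multiplied by 512 bytes.

```shell
#!/bin/sh
# check_image_size FILE SECTORS
# Hypothetical helper: succeeds if FILE is exactly SECTORS * 512 bytes long.
check_image_size() {
    expected=$(( $2 * 512 ))
    actual=$(wc -c < "$1" | tr -d ' ')
    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $1 is $actual bytes"
    else
        echo "MISMATCH: $1 is $actual bytes, expected $expected" >&2
        return 1
    fi
}

# Example: pass the same sector count you gave to dd.
# check_image_size image.img <count>
```

A mismatch usually means dd hit a read error or the wrong count was used, so it's better to catch it here than after the new card refuses to boot.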

 

2) Let’s copy the SD Card’s rootfs to an external drive! Hook up your empty external hard drive. On this drive we will do a few things. First, partition the drive (a single partition for the rootfs; a swap partition is optional and covered in the notes at the end), then we will make the filesystem, and finally copy over the rootfs to the new drive.

In my case, when I hooked up my external drive to my computer, it was assigned as /dev/sdc, the SD card is still /dev/sdb. Keep in mind this may be different on your computer, if you use the incorrect values, you may actually damage your current linux workstation. You can always use dmesg after hooking up a usb device to find out what sd* it was assigned.

First let’s create a partition table on the usb drive. Again, on my computer the usb drive is sdc, yours may be different.

fdisk /dev/sdc

In fdisk, we will create a partition for an ext4 filesystem. Press “n” for a new partition, follow the instructions, create a primary partition, and let fdisk choose the start and end. This will use all the space on the hard drive. When done creating, press “w” and enter to write the partition table. This creates the /dev/sdc1 partition. Now we need to create the filesystem and label it.

mkfs.ext4 /dev/sdc1 -L rootfs

This creates a ext4 filesystem and labels it “rootfs”.

 

3) Let’s copy the old rootfs to the new one. First we need to mount both of them, and copy everything over.

To do this, we are going to create two folders which we will mount both of the rootfs to.

mkdir rootsource

mkdir rootdest

Then we will mount them:

mount /dev/sdb2 rootsource/

mount /dev/sdc1 rootdest/

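And finally, we will copy over the rootfs and then unmount the mounts. Note the trailing “/.” on the source, which makes sure any hidden files in the top level of the rootfs get copied too (a plain * would skip them):

cp -afv rootsource/. rootdest/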

umount rootsource/

umount rootdest/

You have now copied over the root filesystem from your SD card, to the external USB drive.
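
If you want to sanity-check the copy, do it before the umount commands while both filesystems are still mounted. The following is a rough sketch with made-up helper names (`list_tree`, `trees_match`): it compares the list of paths in both trees, hidden files included. It only compares names, not file contents or permissions, so treat it as a quick check rather than proof.

```shell
#!/bin/sh
# list_tree DIR
# Print every path under DIR (including hidden files), relative to DIR, sorted.
list_tree() {
    (cd "$1" && find . | LC_ALL=C sort)
}

# trees_match SRC DST
# Hypothetical helper: succeeds if both trees contain the same set of paths.
trees_match() {
    a=$(list_tree "$1")
    b=$(list_tree "$2")
    [ "$a" = "$b" ]
}

# Example (as root, with both filesystems still mounted):
# trees_match rootsource rootdest && echo "same paths" || echo "paths differ"
```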

 

4) Now we are done with the original SD card, you can remove it, and put in your new SD card which will be used to boot the kernel.

First we need to write the image that contains the partition table, and boot partition. We do this by writing the image file you created above to the NEW SD card. Again, MAKE SURE YOU KNOW WHAT /dev/sd* device the new card is being recognized as.

dd if=image.img of=/dev/sdb

This will write the image to your new SD card. After this, even though we only took the partition table, and the boot partition, the partition table still contains information of the old rootfs. Let’s clean this up (even though we don’t have to).

fdisk /dev/sdb

Inside of fdisk, we will hit “d” to delete all bogus partitions EXCEPT for the boot partition (partition 1). In my case I had 2 I had to remove, partition 2 and 3. You will probably only have one, which will be partition 2. After you’re done deleting, hit “w” and enter to save.

 

5) Now we have to go to the boot partition and prepare it to boot off the USB drive.

Let’s create a directory to mount and work inside of, then we will update the cmdline.txt file which the Raspberry Pi uses for information to boot from.

mkdir bootmount

mount /dev/sdb1 bootmount

nano bootmount/cmdline.txt

We are now looking at the cmdline.txt file. We need to update it to boot from the USB drive rootfs partition which we created above. Keep in mind, when you hook the USB drive up to your Raspberry Pi, it will be the only /dev/sd* device, so it will probably appear as /dev/sda if you have no other USB drives connected to it. Update yours, here’s an example of mine:

dwc_otg.lpm_enable=0 console=ttyAMA0,115200 kgdboc=ttyAMA0,115200 console=tty1 root=/dev/sda1 rootfstype=ext4 rootwait text

Save and close the cmdline.txt and then unmount.

umount bootmount
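
If you'd rather not edit cmdline.txt by hand, the root= change can be scripted. This is just a sketch under my own assumptions: `set_root_device` is a made-up helper, it keeps a copy of the original file with a .bak extension, and it only touches the first root= parameter on each line. It needs to be run while the boot partition is still mounted (so before the umount above).

```shell
#!/bin/sh
# set_root_device FILE DEVICE
# Hypothetical helper: rewrite the root= parameter in a kernel command-line
# file, leaving every other parameter (including rootfstype=) untouched.
# The original file is kept as FILE.bak.
set_root_device() {
    cp "$1" "$1.bak" || return 1
    sed "s|root=[^ ]*|root=$2|" "$1.bak" > "$1"
}

# Example:
# set_root_device bootmount/cmdline.txt /dev/sda1
```

It's still worth opening the file afterwards to confirm that rootfstype and rootwait are set the way you want them.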

 

So here is where we sit (in order):

-We imaged the old boot partition

-We prepared the new USB drive

-We copied the rootfs to the new USB drive from the SD card

-We wrote the boot image to the new SD card

-We configured the new SD card to boot using the external drive as a root fs.

 

All we have left to do is connect the USB drive to your Raspberry Pi, and insert the new boot SD card in to the Pi as well. Boot your Raspberry Pi, and if all is well, it should boot off your new rootfs (on the USB drive). This will improve speed since the USB drive is WAY faster than the SD card, and now your SD card will only be used to boot the kernel.

Important Notes:

-If these instructions don’t work, chances are your USB device may require a driver that is not built in your current Raspberry Pi linux kernel, you will need to identify what module your USB device uses by issuing a “lsmod” when you have booted off your old SD card, then re-compiling the kernel with this built in to the kernel and NOT as a module. Instructions on compiling a kernel can be found here: http://www.stephenwagner.com/?p=616

-If you really feel comfortable with Linux, after you do this, you could re-size the partitions, and create a swap partition for your Raspberry Pi. ONLY set up a swap partition if you are using an external USB drive, as SD cards are NOT fast enough, and swapping can actually shorten the life of an SD card due to the I/Os.
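
For the first note, one way to tell whether a driver is built in or a module is to look at the kernel's .config file. Here is a small sketch with a made-up helper, `config_state`; CONFIG_USB_STORAGE is only an example option (the generic USB mass storage driver), and your drive may need something else. It assumes you have the plain-text .config for the kernel you are running, for example from your kernel build tree.

```shell
#!/bin/sh
# config_state OPTION [CONFIG_FILE]
# Hypothetical helper: print "built-in", "module", or "not set" for a kernel
# config option, reading a plain-text kernel .config file.
config_state() {
    opt=$1
    cfg=${2:-.config}
    if grep -q "^$opt=y\$" "$cfg"; then
        echo "built-in"
    elif grep -q "^$opt=m\$" "$cfg"; then
        echo "module"
    else
        echo "not set"
    fi
}

# Example:
# config_state CONFIG_USB_STORAGE .config
```

If it reports "module", that driver won't be available early enough to mount a rootfs on the USB drive, which is why it needs to be compiled in to the kernel instead.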