Nobody Knows The Trouble I’ve Seen: Part Deux
On January 5, I was having breakfast with J at Mike and Rhonda’s when my cell phone interrupted my casual coffee enjoyment. It was the office. The workstations were all off line. To me it sounded like a network issue. I stepped outside to finish my call; people talking on their cell phones in restaurants is one of my pet peeves (closely followed by people on their cell phones in any public place). Step by step, each suggestion failed in turn. I resigned myself to cutting breakfast short and heading into the office before the paying job.
Once at the office I went through the steps that I was assured had already been taken: power cycle the network switches, then the server, then the workstations. Nothing. Each phase that should have – could have – worked didn’t. I tailed the logs and watched as the workstations/clients booted up, and nothing made sense. Then the screen went dark and the BIOS screen appeared. The server had just spontaneously rebooted; never a good sign. Nothing seemed to work and the office was shut down for the day. I went to my full-time job for the remainder of the day.
After getting out of work at 7:00pm, I headed back to the tax office to read through the logs again and see if there was anything that I had missed. Workstations still would not PXE boot. The server spontaneously rebooted on me a couple more times and I resigned myself to the fact that the server, at somewhere around 7 years old, had reached the end of its life. The hard drives were reporting (using SMART) that they were aging, occasionally showing sectors not available. I knew at some point the server would need to be migrated, but I wasn’t ready. I really did hope to get another tax season out of that machine – it was not to be.
On January 6, I unstacked the stash in the corner. Imagine a tower of tower computers placed next to the wall width-wise and two lengths left to right. Imagine that tower at two to three high. Yep, the bane of every geek’s non-geek wife (or non-geek husband as the case may be): the overt hoarding of old computers just waiting for the day when they can be salvaged and recombined into a working machine. These machines are only awaiting the day when their geek overlord, master of their existence, has the chance to evaluate and resurrect them. I unpiled that stack looking for a gem that I knew was there – the beige beast.
The beige beast is pretty impressive. It houses the Intel® Server Board SE7501BR2 and has dual hot swap power supplies, 5 hot swap fans with internal wind tunnels (firing this puppy up gave me wind chill), room for 5 scsi hard drives (only one was actually installed, a comparatively small 18 GB Ultra 320 drive), intrusion detection, and dual 2.4 GHz Xeon chips. Now this was a hand me down (thanks Steve) so there are no complaints. Some of the hardware is absolutely impressive – dual Xeons in a box that was decommissioned sometime around 2007 and ran Windows 2000 Server. That box cost a pretty penny when it was originally deployed. Today you could grab the board (used) on eBay for less than $20.00. Of course the case is not included at that price.
Wednesday night I began my installation journey. I burned the K12LTSP v5EL dvd. I let the drive select the speed and it warned there might be an underburn. Naturally I stuck it in the drive and booted – what could go wrong? It did not see the disk. It did not boot. That’s what could go wrong. I grabbed another blank dvd and set the drive speed to 5x. No warnings. Excellent. I swapped the dvds in the drive and… same thing. Then it dawned on me. The drive in the machine was a CD-ROM drive. I pulled a dvd drive that I had on the shelf, powered the server off, and temporarily installed the dvd drive. I powered up the server and it still choked.
After weighing my options, I decided to pull out the dvd drive, install an ide hard drive, boot Knoppix up, format the scsi drive, copy over the diskboot.img from my Debian workstation to the scsi drive, then dd the image onto the ide drive. Phew! Then reboot the server, popping out the Knoppix cd. Then I had to remember how to use NFS. Simple: add a single line to /etc/exports on the workstation:
/cdrom 192.168.1.0/255.255.255.0(rw,sync,no_subtree_check)
and restart nfs. Of course I had to mount the dvd in the /cdrom directory first. Switching over my old school KVM switch (4 ports with a rotary dial) I typed in the settings to get the server to see the NFS install. It did. It was a thing of beauty. The install seemed to be a bit “dumbed down” from the last time I had to install LTSP – when the now dying server was new. There were just a few options: which packages do you want, how is the network set up, keyboard, language, time zone and root password. And a simple partition selection screen. And away it went.
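For anyone retracing the NFS part, here is a rough sketch of what the workstation side might look like. It is not a transcript of what I typed. The helper names (run, exports_line, serve_install_dvd) are made up for illustration, the dvd device path is an assumption, and exportfs -ra is swapped in as one way to reload the export table in place of a full nfs restart. By default it only prints the commands, since they need root.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not run, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build one /etc/exports entry: <directory> <network>/<netmask>(<options>)
exports_line() {
    printf '%s %s/%s(%s)\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: mount the install dvd and re-export it over NFS.
# $1 is the dvd device (assumption: something like /dev/cdrom).
serve_install_dvd() {
    dvd_device="$1"
    entry="$(exports_line /cdrom 192.168.1.0 255.255.255.0 rw,sync,no_subtree_check)"
    run mount -o ro "$dvd_device" /cdrom
    # The entry still has to be added to /etc/exports by hand.
    echo "add to /etc/exports: $entry"
    # Reload the export table so the new entry takes effect.
    run exportfs -ra
}
```

Calling serve_install_dvd /dev/cdrom in the default dry-run mode just lists the steps, which is a handy way to double check the device name before running anything as root.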
It got to the end of the install and it was time to reboot. It power cycled and the CentOS 5 splash screen came up. Success! I took care of a couple of first-boot things, rebooted to make sure it still worked, and shut it off for the night. I was supposed to have the wind tunnel powered off by 9pm and it was past 10:30pm.
Thursday morning I pulled the ide drive out, since I only needed it to install, and booted. No boot media found. CentOS had put the boot loader on the first hard drive, the one that I had just finished putting on the shelf. Since it didn’t boot, I figured I would just boot from Knoppix again, mount the drive and install the grub boot loader to the master boot record on the scsi drive, like I’ve done many times before. Except Knoppix wouldn’t find the drive. Nothing in /dev/, and Knoppix has been a pretty good tool for things like this. It turns out that since the drive was installed as a logical volume using LVM, Knoppix just didn’t find it. I had to do some incantations and magical hand waving to find out which volumes there were (by which I mean I can’t find the same answer on Google that I did before). Simply put, I had to figure out which logical volumes and which physical volumes existed (using lvdisplay and pvdisplay) so I could mount the drive, chroot into it and run grub. And it worked. It booted.
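Since I can’t reproduce the exact incantations, here is a hedged sketch of how this kind of rescue usually goes from a live cd. Everything specific in it is an assumption: the volume name VolGroup00/LogVol00 (what I believe a default CentOS 5 install creates), the /dev/sda1 boot partition, the /dev/sda disk, and the helper names run and reinstall_grub. Check the output of lvdisplay and pvdisplay for the real names on your own machine. Like any boot loader surgery it needs root, so by default it only prints the commands.

```shell
#!/bin/sh
# Sketch only. Commands are printed, not run, unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"

# Print a command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: activate LVM, mount the installed system, reinstall grub.
#   $1 volume group/logical volume holding / (assumption: VolGroup00/LogVol00)
#   $2 separate /boot partition             (assumption: /dev/sda1)
#   $3 disk that should carry the boot loader (assumption: /dev/sda)
#   $4 mount point for the chroot (optional)
reinstall_grub() {
    root_lv="$1"
    boot_part="$2"
    disk="$3"
    root="${4:-/mnt/sysroot}"

    run vgscan          # look for volume groups on the attached disks
    run vgchange -ay    # activate them so /dev/<vg>/<lv> nodes appear
    run lvdisplay       # list logical volumes to confirm the names
    run pvdisplay       # list the physical volumes behind them

    run mkdir -p "$root"
    run mount "/dev/$root_lv" "$root"
    run mount "$boot_part" "$root/boot"
    run mount --bind /dev "$root/dev"   # grub needs device nodes inside the chroot

    # --recheck rebuilds the device map, which may still mention the removed ide drive.
    run chroot "$root" grub-install --recheck "$disk"
}
```

Running reinstall_grub VolGroup00/LogVol00 /dev/sda1 /dev/sda in dry-run mode lists each step in order; setting DRY_RUN=0 would actually execute them, so only do that once the names match what lvdisplay and pvdisplay report.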
I carried the beast downstairs to the car and drove to the office. I won’t bore you with details of how long I sat in our server closet moving files off the limping machine. I got the network back online. Minimal, but online. It was a busy couple of days. Too bad it wasn’t over. Yes, there’s a part trois so stay tuned.




