Patrick’s Rants



7/1/2007

Kicking And Screaming

Filed under: Geek News and Stuff — site admin @ 10:42 am

Time for the annual server upgrade at the office.

I’d been holding off because the K12LTSP install disks refused to recognize the hard drives. I kept getting prompts to format the drives. That certainly wasn’t going to work; there’s too much data on that machine. I had also been holding off on upgrading the kernel because lilo on that machine can’t be run without rebooting into something like Knoppix.

I walked into the office Saturday afternoon, a stack of CDs in hand. I inserted the first install CD and rebooted the machine, and the familiar install screens came up. And that’s where it stopped being easy…
I got through the keyboard setup and then the screen that searches for previous Fedora installations. That’s when it asked me to format the drive. Damn, it can’t see the RAID array either. Well, time to try something different. I really don’t remember how many times I booted and rebooted, switching between the Knoppix CD and the K12LTSP install CD. I did learn how to set up the RAID array so that I could use it; Knoppix uses mdadm while K12LTSP uses startraid. One attempt involved starting up the install disk, switching to a different console and running the startraid command. Unfortunately, the install script then died while looking for the installed packages.

It’s time to stop procrastinating, put grub on this thing and get the latest kernel, so I set aside my install disks and reboot into the running installation. I uninstalled lilo from the MBR (Master Boot Record) by running lilo -u. A quick search for lilo uninstall turns up this Microsoft page, which humorously states:

NOTE: The following procedure is not supported by Microsoft and is performed strictly at the discretion of the user. Microsoft assumes no liability for lost or corrupted data. This procedure should be performed only as a last resort.

No, they don’t mean the lilo procedure; they’re referring to fdisk /mbr. They have no such disclaimer for users wanting to boot to Linux to remove lilo from the MBR.

I used yum to install grub and then ran the grub-install script. I also installed the latest kernel. It was all going well. Until the reboot. The machine doesn’t come all the way up; I get the grub prompt. It takes some time, but I figure out that I can load the kernel and the initrd image and then issue the command to boot. This isn’t going to work, no way, no how. I can’t see my partners typing in the commands to load the kernel, then the initial image, then boot. I can’t see me doing that either. I’m no grub guru, so I have to use Google to find all my information. The grub manual says what to do, but not what to do when that fails. I end up with several different errors over the next few hours, but nothing that looks like the splash menu that comes up on the other server. I compare the grub.conf files from both machines, but nothing differs between them.
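For reference, here is a rough sketch of what that manual boot sequence looks like at a legacy grub prompt. It is only an illustration: the function name grub_manual_boot is made up, the (hd0,0) default is borrowed from the install output further down, the Fedora-style file names under /boot are an assumption, and the kernel version and root device are placeholders rather than the values from this server.

```shell
#!/bin/sh
# Print the legacy GRUB commands needed to boot one kernel by hand.
# Usage: grub_manual_boot KERNEL_VERSION ROOT_DEVICE [GRUB_PARTITION]
# Assumption: kernel and initrd live under /boot with Fedora-style names.
grub_manual_boot() {
    version=$1
    rootdev=$2
    part='(hd0,0)'                      # assumed default partition
    if [ -n "${3:-}" ]; then
        part=$3
    fi
    printf 'root %s\n' "$part"          # partition holding /boot
    printf 'kernel /boot/vmlinuz-%s ro root=%s\n' "$version" "$rootdev"
    printf 'initrd /boot/initrd-%s.img\n' "$version"
    printf 'boot\n'                     # hand control to the loaded kernel
}
```

Calling grub_manual_boot with a kernel version and a root device prints the four lines to type at the grub> prompt, which is exactly the routine nobody in the office should have to do on every reboot.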

I struggled with the grub install script for way too long before deciding to call it a day and head home, sometime around 9:30pm. I had been at the office for about five hours by then. Stressed out, I went to bed and fell into a restless sleep that had me back awake at midnight. By 12:30am, I decided to head back to the office to finish the install, I hoped.

I fought with the machine, booting and rebooting. Sometimes I would get a grub hard disk error: nothing happens except that the error message pops up on the screen and the computer just stops. I finally found a discussion of the same problem. Essentially, the grub commands were echoed onto the screen:
Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2
/boot/grub/grub.conf"... succeeded

It looked good, especially the “succeeded” part. But it still didn’t boot. A little further down the discussion I found a “fix”:

Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2
/boot/grub/grub.conf"... succeeded
Done

grub> install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2 /grub/grub.conf
So basically, I had to run the grub command “setup” and then run a second “install” that chopped the “/boot” part off the path /boot/grub/grub.conf. I rebooted the machine around 3:20am and it worked.

Now, for that upgrade. A version upgrade in Fedora requires installing the new redhat-release or fedora-release package so that yum starts looking for the new release’s packages. The method I use is to download the new release package with a small script I wrote, “getrpm”, which does just that: it gets the rpm package named next to it on the command line. I installed the new version, ran yum clean to purge the cache and local files, upgraded yum, and then ran yum upgrade. The list came to around 890 packages. I hit “y” and went home. I was in bed by 3:30am.
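The upgrade steps can be sketched as a short shell function. This is a sketch under assumptions, not the getrpm script, which isn’t shown here: the release rpm is assumed to be downloaded already, its file name is a placeholder, the names run and release_upgrade are made up for illustration, and DRY_RUN defaults to 1 so that the function only prints the commands instead of running them.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

# Run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: release_upgrade /path/to/new-release-package.rpm
# The rpm file name is a placeholder for the downloaded release package.
release_upgrade() {
    release_rpm=$1
    run rpm -Uvh "$release_rpm" &&   # point yum at the new release
    run yum clean all &&             # purge cached metadata and packages
    run yum upgrade yum &&           # upgrade yum itself first
    run yum upgrade                  # then everything else
}
```

With DRY_RUN=0 and root privileges the same function would run the commands for real, stopping at the first one that fails.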

Later that morning, Sunday, I got up to check the progress. I could ssh in from home, but the server hadn’t upgraded: some old kernels had to be removed first. Then I had a conflicting package. I uninstalled it and tried yum upgrade again. This time, there were no conflicts. When I went in on Monday to check on things and meet with a client, nearly everything was working, except printing from my install of OpenOffice.org. All in all, not a bad upgrade, once that damned grub was figured out.
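Picking out the old kernels can be sketched like this. It assumes that rpm -q kernel prints names of the form kernel-VERSION and that VERSION matches the output of uname -r for the running kernel, which may not hold on every release; the function name old_kernels is made up for illustration.

```shell
#!/bin/sh
# Read installed kernel package names on stdin, one per line, and print
# every one except the package for the kernel version given as $1.
# Assumption: package names look like "kernel-<version>".
old_kernels() {
    keep="kernel-$1"
    while IFS= read -r pkg; do
        [ -n "$pkg" ] || continue                  # skip blank lines
        [ "$pkg" = "$keep" ] || printf '%s\n' "$pkg"
    done
}
```

A typical use would be rpm -q kernel | old_kernels "$(uname -r)", reviewing the list by eye, and then removing each listed package with rpm -e before retrying yum upgrade.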
