Nobody Knows The Trouble I’ve Seen: Part Trois
Looking back over the previous entry I realize that I made the install seem just slightly easier than it was. When I wrote,
format the scsi drive, copy over the diskboot.img from my Debian workstation to the scsi drive, then dd the image onto the ide drive.
I forgot to mention that I had already (sorta) done this once. I copied the diskboot.img to the SCSI drive and ran the install. The problem with installing to the same drive that holds the disk image you are booting from is that it doesn’t really work. Oh, it pretends to work, mounting the image as a loopback filesystem, but it doesn’t completely and correctly install. At least not with CentOS 5, which is my baseline. This is why I ended up with the IDE drive in the machine too. It’s also why I installed the OS twice in one night. But enough about that time.
A few days after this server was installed as a crutch, I got another phone call. No network booting. Which is where I started. I went through a couple of things that might be wrong. The ethernet cable was in a spot where it could get bumped, so I had them wiggle the cable. It worked. Until I got the next phone call. I resigned myself to going into the office to work on this machine again.
I had to grab a chair and connect up a keyboard, mouse and monitor, and then I sat down in front of the beast. The screen did not come alive. Several boot cycles later I decided the old 18 GB SCSI drive must have given up the ghost. It didn’t work in any of the hard drive slots, and Knoppix would not see it when I booted up that way. So, following my two-drive install approach, I came home, picked up 2 IDE drives from the shelf (no spare Ultra 320 drives here) and drove back to the office. I cracked open the box and stuck in the 2 drives. It was the same dance as before: boot Knoppix, copy the netboot image, boot from the netboot image and run the install from the crippled office server. I was able to keep the failing server running long enough to get my install done. I rebooted and everything looked beautiful.
Walking to the workstations I realized I wasn’t done. The screens showed a gray hash-marked background with an X cursor. No logon prompt. I spent until midnight or later that night trying to edit this config file and that config file. Nothing worked. And the thing that was bugging me was that I was using the same config files (copied from the old server) that had worked until the SCSI drive died. I finally stumbled upon the fix: you have to log in to the GUI on the server and change a setting there:
Now go to System -> Administration -> Login Window
Now click on “Remote”
From the drop-down menu of styles, select “Same as Local”.
This is the first time I have had to set this since I started using K12LTSP in 2002. I’m not sure why this install – done mere days after the last one – required this change, but it did. Even worse, the fix was not at the top of my searches or I might have tried it first.
I also started running into trouble with the backups running from the Windows 2008 Server to the Linux server. It turns out that using Cygwin rsync over ssh has some potential problems. For one, rsync hangs. On top of that, my little bash script wasn’t set up to run only one copy at a time (by using a lock file), so the hung copies piled up, with rsync running multiple times and hogging all the CPU and RAM. The final solution was to run rsync outside of ssh and use lock files.
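For anyone wondering what the lock file part looks like, here is a minimal sketch of the idea. It is not my actual bash script; it is a Python stand-in, and it assumes a Unix-like environment where the fcntl module is available. The function name run_exclusive, the lock path, the host name and the rsync module name are all made up for illustration. The point is only that a second run finds the lock held and exits instead of piling on.

```python
import fcntl
import subprocess
import sys


def run_exclusive(lock_path, command):
    """Run command only if nobody else holds the lock.

    Returns the command's exit code, or None if another run
    already holds the lock and this one was skipped.
    """
    # Opening in "w" mode creates the lock file if it is missing.
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: fail right away instead of queueing up.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None
        # The lock is released when the file is closed at the end of the with block.
        return subprocess.run(command).returncode


# Example use (hypothetical paths, host and rsync daemon module):
#
#   code = run_exclusive(
#       "/tmp/backup.lock",
#       ["rsync", "-a", "/srv/data/", "rsync://backuphost/backups/"],
#   )
#   if code is None:
#       sys.exit("previous backup still running, skipping this run")
```

The lock goes away by itself if the process dies, which is the main reason to prefer a held lock over just checking whether a file exists. It does not fix a hung rsync, though; it only keeps the hung copies from multiplying.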
In the end this crutch held me over until the arrival of the new PowerEdge T300 (next in the saga).
