Data Center Karma
Aug. 16th, 2008 10:02 pm
I'm at the office currently. I arrived about 9 AM to assist the Data Warehouse team in moving a number of servers from our Reston data center to our Urbana location.
Since I know the Urbana facility intimately and am fairly good with the Sun Microsystems equipment they were moving, my assistance was gratefully accepted. The shutdown team in Reston would do most of the dirty work of updating configuration settings and there would be others in Urbana to do the required work on the other machines arriving here. I would help install one Sun E6800 server, confirm the clustering settings were functional and then go on my merry way. Right.
The Reston team made a total hash of these machines. I mean, really, how do you wipe out the /etc/shadow file and replace it with a chunk of /etc/vfstab?
And it seems that the latest version of the project plan wasn't forwarded to me: I was to work solo to rebuild all of the Sun machines in Urbana. Okaaaaay.
Half of the servers were functional fairly quickly. Each took about two hours to attach all of the cabling, make some minor tweaks to the operating system, boot single-user, run some tests, boot into multi-user mode and turn over to the applications teams for their testing.
One machine needed considerable reworking of the SAN zoning but otherwise had no issues.
And one machine has come straight from hell.
It seemed every time I fixed something, there would be some other disaster. I fixed the configuration file munging, but then the network interfaces wouldn't work correctly. I redid their configuration, and then the SAN connections vanished. I finally fixed all the issues and rebooted the machine, whereupon the superblocks of the boot partition all crapped out. I've spent the past hour making updates to the alternate boot disk of the server, but it's showing signs of dying too, so we're quickly trying to jury-rig a third boot disk and copy the critical files to it immediately for a faster recovery.
I might be outta here by midnight but we'll see.
Edit: It's 11:40 PM and I'm heading for home.