Cycling and bicycle discussion forums. 

Touring Have a dream to ride a bike across your state, across the country, or around the world? Self-contained or fully supported? Trade ideas, adventures, and more in our bicycle touring forum.

Old 07-29-08, 09:32 AM   #1
NeilGunton
Crazyguyonabike
Thread Starter
 
Join Date: Nov 2003
Location: Albany, OR
Bikes: Co-Motion Divide
Posts: 684
Crazyguyonabike down

I woke up this morning to find the server completely down - no ssh even. I had to log in via the remote KVM to see that the server had been rebooted and was needing a manual file system check, which takes a while with ext2 (I use that rather than ext3 for speed). I called Chi Networks (our colo hosting company) and was told that there was a power outage at the XO datacenter last night, so a lot of stuff went down. He's still waiting for an explanation from XO Networks. This sort of thing should never happen, they have a UPS in there the size of a small bus. Anyway, I'm busy doing the filesystem check and then I'll have to check a bunch of other stuff to make sure it's ok (database integrity etc) as you do after a hard crash. This is just to let people know what's going on; hopefully we'll be back up as soon as possible.

Neil
Old 07-29-08, 10:13 AM   #2
jeffpoulin
Senior Member
 
Join Date: Jul 2008
Bikes:
Posts: 2,178
Thanks for the update. I was in the middle of reading about Greg White's 2004 tour. Good stuff, but I'll have to wait to finish it.

Good luck with the restore. If you need any help, let me know. I'm a Linux system administrator with over 10 years of experience. I'd happily donate my time to get your site back up.
Old 07-29-08, 10:58 AM   #3
NeilGunton
Crazyguyonabike
Thread Starter
 
Thanks Jeff, I think we're back up now. Bad news is that the little external USB drive which I have on there for local backup isn't registering with the system any more. It's like it's just not there. I'm asking the datacenter guys to eyeball it to make sure it's physically present, if so then I guess I'll have to have them send it back to me so I can test it. Rats, these things always seem to happen together, don't they. Sigh.

Anyway, the site should be up again now, barring more datacenter outages.

Neil
Old 07-29-08, 11:28 AM   #4
paxtonm
Senior Member
 
Join Date: Sep 2005
Location: Hollister, CA
Bikes: Bianchi San Jose, Mercian King of Mercia
Posts: 455
Not quite yet

Hi Neil,

You're still not up from my end. Best of luck with the trouble-shooting. I only wish I could contribute something useful.

Mark
Old 07-29-08, 11:36 AM   #5
NeilGunton
Crazyguyonabike
Thread Starter
 
I had to bring the server down again in order to add indexing and journaling to the filesystem (ext2 -> ext3 for geeks, plus dir_index). On reboot the filesystem has to be checked again, which takes a wee while. Sorry about that. Hopefully the ext3 system should be a bit more resistant to losing files after hard crashes - it seems that we lost yesterday's log file for crazyguyonabike, which is surprising to me and kind of a bummer. I knew ext2 was less robust than ext3, but I didn't realize you could lose whole files like that. Oh well, you live and learn.
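For anyone curious what the ext2 -> ext3 change involves, here is a minimal sketch. Neil did not post the exact commands he ran, so the steps below are an assumption based on the standard e2fsprogs tools (tune2fs and e2fsck). The device name /dev/sdX1 is a placeholder, and the function only builds the command list rather than running anything.

```python
def ext3_conversion_commands(device):
    """Return the usual e2fsprogs steps for turning ext2 into ext3 with dir_index.

    Nothing is executed here; the commands come back as argument lists so they
    can be reviewed first. They need root, and e2fsck in particular needs the
    filesystem to be unmounted.
    """
    return [
        # Add a journal, which is what makes an ext2 filesystem ext3.
        ["tune2fs", "-j", device],
        # Turn on hashed directory indexes for faster lookups in big directories.
        ["tune2fs", "-O", "dir_index", device],
        # Force a full check and rebuild existing directories so they get indexed.
        ["e2fsck", "-f", "-D", device],
    ]


for argv in ext3_conversion_commands("/dev/sdX1"):  # placeholder device name
    print(" ".join(argv))
```

The matching /etc/fstab entry would also need to say ext3 instead of ext2, otherwise the journal is not used when the filesystem is mounted.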

We should be back up shortly, sorry for the delay...

Neil
Old 07-29-08, 12:06 PM   #6
jeffpoulin
Senior Member
 
Neil, you may have had filesystem corruption on the USB drive too. In that case, your drive would fail to mount.

To see if it detects a USB drive at all, you can run "lsusb -v" or "dmesg" (as root). With dmesg, you'll want to look for lines with "usb-storage" and the next dozen or so lines afterwards.

If the drive is there, you can see if the partitions are in order by running "fdisk -l /dev/sdX" (replace sdX with your USB's device name like sdb, sdc, etc...).

If the partitions look good, then try running a file system check on it. You can run "fsck /dev/sdX1" (again, replace sdX1 with your device name and partition number like sdb1, sdc1, etc...).
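The dmesg step can be scripted if you'd rather not scroll through the whole kernel log. This is only a rough sketch: the function names are made up for illustration, and the default of 12 following lines is just the "dozen or so" mentioned above.

```python
import subprocess


def usb_storage_context(dmesg_text, following=12):
    """Return each kernel log line mentioning usb-storage plus the lines after it."""
    lines = dmesg_text.splitlines()
    hits = []
    for i, line in enumerate(lines):
        if "usb-storage" in line:
            # Keep the matching line and the next `following` lines as one group.
            hits.append(lines[i:i + 1 + following])
    return hits


def read_dmesg():
    """Capture the kernel log by running dmesg (run the script as root)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling usb_storage_context(read_dmesg()) gives one group of lines per match. An empty list suggests the kernel never logged the drive being detected.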

Good luck!
Old 07-29-08, 12:14 PM   #7
jamawani 
Hooked on Touring
 
Join Date: Mar 2004
Location: Wyoming
Bikes:
Posts: 2,268
Thanx, Neil.

Not to mention - - -
That a lot of folks out here in blogland think you are incredible!
Old 07-29-08, 12:17 PM   #8
NeilGunton
Crazyguyonabike
Thread Starter
 
Hi Jeff,

I don't know how to interpret all the output from lsusb -v, but I had already done a simple lsusb and dmesg, and nothing comes up. The lsusb doesn't list any drives attached at all, and dmesg doesn't have anything mentioning /dev/sda, which is what the drive should come up as. So it's like it's not even plugged in. At this point, not being next to the machine, I'm not sure what to do about this except get the techs to unplug the drive and send it back to me... or maybe, if they are up for it, trying to plug it into one of their linux boxes and reformatting if it's not completely fried. But as it never even registers, I'm thinking this must be a more severe problem than corrupt files - the device itself is just never appearing. Maybe there was some kind of power surge or spike when the event happened last night, and maybe somehow that got transmitted through to the poor little USB drive. I mean, it's been working fine up to now, and it seems a bit of a coincidence that we should have this major power outage, and right after that the USB drive dies. Not sure what else to try except getting them to send the thing home to me and maybe get an RMA from NewEgg for an exchange - maybe it was just a bad drive, it does happen. Bummer either way. I'm open to other suggestions...

Thanks,

Neil
Old 07-29-08, 12:43 PM   #9
jeffpoulin
Senior Member
 
Hmmm, that does sound bad. If the usb drivers are loaded in the kernel (you can use "lsmod | grep usb" to check), the drive hasn't been unplugged, and it has power, then the likely conclusion is that your drive is dead (as you've already guessed). If you're lucky, the hard drive itself may be okay and it's only the enclosure's circuit board that got fried. In that case, you could try putting the drive in another enclosure and (hopefully) get access to your backups again.
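The lsmod check can be done in a script as well. A small sketch, assuming the usual three-column lsmod layout with a header line; the function name is made up for illustration. Note that it only looks at module names, so it is a bit narrower than "lsmod | grep usb", which also matches the "Used by" column.

```python
def loaded_usb_modules(lsmod_text):
    """Return the names of loaded kernel modules whose name contains 'usb'."""
    modules = []
    for line in lsmod_text.splitlines()[1:]:  # skip the column header line
        fields = line.split()
        # The first column is the module name, e.g. usb_storage.
        if fields and "usb" in fields[0]:
            modules.append(fields[0])
    return modules
```

Feed it the text output of lsmod. If the result has no usb_storage entry, the USB mass storage driver is probably not loaded as a module (it could still be built into the kernel).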

Jeff
Old 07-29-08, 04:22 PM   #10
NeilGunton
Crazyguyonabike
Thread Starter
 
Good news - the drive came back to life after a datacenter tech unplugged it and plugged it back in again. Now it registers again just fine as /dev/sda1 and I can mount etc. I'm guessing it got into some kind of weird state as a result of the power outage, and even with a reboot, the power to the USB was maybe never completely cut off. If this happens again, maybe I'll try a complete power down via the remote KVM interface in order to "reboot" the USB drive out of its funk. Strange stuff all round.
Old 07-30-08, 09:16 AM   #11
jeffpoulin
Senior Member
 
That's great, Neil! Thanks for getting the site back up. I was able to finish reading Greg White's "Leave of Absence" tour.
Old 07-31-08, 09:58 AM   #12
ebrady
Senior Member
 
Join Date: Aug 2007
Location: Delaware, OH
Bikes: Giant OCR2, Peugeot Altitude 21 MTB
Posts: 166
Quote:
Originally Posted by NeilGunton
This sort of thing should never happen, they have a UPS in there the size of a small bus. Anyway, I'm busy doing the filesystem check and then I'll have to check a bunch of other stuff to make sure it's ok (database integrity etc) as you do after a hard crash. This is just to let people know what's going on; hopefully we'll be back up as soon as possible.

Neil
You are correct, a power outage at a data center is something that must never happen! If they do not give an acceptable reason why this happened, you may be best served by finding a different center to host the site.

I used to work for a company that developed and manufactured UPS equipment for data centers. While there are exceptions, most of the time when there is a loss of power to the servers (called "dropping the load"), it is caused by the owner/operator.
Old 07-31-08, 12:11 PM   #13
vik 
cyclopath
 
Join Date: Apr 2006
Location: Victoria, BC
Bikes: Surly Krampus, Surly Straggler, Pivot Mach 6, Bike Friday Tikit, Bike Friday Tandem, Santa Cruz Nomad
Posts: 5,265
Neil, I've been following the CGOAB down threads out of interest. Frankly, I barely comprehend what you are talking about specifically, although I do understand the larger issues that you are dealing with to keep the site running reliably. In any case, I have a new respect for the work that goes on behind the scenes at CGOAB.

This summer I've met at least 10 groups of cycle tourists and in every conversation CGOAB has come up at some point. Either they had a journal there, found info for planning their trip on your site or recommended a journal to me that was interesting. It has become a funny common thread across the cycle tourist culture....kind of like "regular folks" meeting at the water cooler and talking about what happened this week on "Friends"...=-)
__________________
safe riding - Vik
VikApproved
Old 07-31-08, 01:39 PM   #14
NeilGunton
Crazyguyonabike
Thread Starter
 
Quote:
Originally Posted by ebrady
You are correct, a power outage at a data center is something that must never happen! If they do not give an acceptable reason why this happened, you may be best served by finding a different center to host the site.

I used to work for a company that developed and manufactured UPS equipment for data centers. While there are exceptions, most of the time when there is a loss of power to the servers (called "dropping the load"), it is caused by the owner/operator.
I have been told by Chi Networks that XO Networks (they run the datacenter) was doing a failover from the main UPS to the backup, when the switch failed, causing the blackout. It wasn't clear if this was a planned failover test, or if they were just doing planned maintenance on the UPS and failed during the switchover then. In any case, it was basically a bad piece of equipment, a case of the system that provides failover itself failing. I guess that kind of thing happens occasionally, and is kind of hard to plan for... I'm not an expert in commercial grade UPS, but a bad switch is a bad switch, and if it fails during failover then I guess you're screwed.

I am in the process of setting up a dedicated server with iWeb, who are based in Montreal, Canada. They seem to have some pretty reasonable prices. A larger hard drive was a primary requirement, and many hosting services have fairly anemic offerings in that department, only going to bigger drives with correspondingly more CPU and RAM etc, which we don't really need. Anyway, they have a package with a 2.4 GHz processor, 1 GB RAM, a 320 GB SATA drive, a 10 Mb port (upgradable to 100 Mb for about $10), and 3000 GB transfer per month for $69, with a $49 setup fee (I could have had that waived if I had signed up for a 12-month contract, but I feel more comfortable paying up front and being able to cancel month-to-month if I want to).

I was originally just looking for a box that I could set up as a MySQL slave and backup image repository, but with this server I could probably actually set it up as a warm spare in case my server completely blows up. They are setting it up with a basic install of Debian Etch, which I'll upgrade to Lenny to match the current server, then I'll build all the software so that it's the same as our current server. It can then be our backup while I am on the road over the next couple of weeks (as we move to Oregon from St Louis). It will feel good to know there is some kind of backup server while my own workstation is offline (usually it acts as the slave and image backup). I also have two external drives here at home which are backed up nightly with rsync, but that wouldn't be happening on the road either. With RAID0, there is a bigger chance of a hard drive crash bringing us down completely, so it will be good to have a backup server available. Not sure how long it would take to actually get it up and running as the production server, but at least it'll have all the data.
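Two of the numbers above can be sanity-checked with a little arithmetic. This is only a sketch under stated assumptions: "Mb" is taken to mean megabits per second, a month is 30 days, and GB is decimal. The 3% per-drive failure chance and the two-drive array are hypothetical figures picked for illustration (they are not from the post), and the drives are assumed to fail independently.

```python
def monthly_transfer_gb(port_megabits_per_s, days=30):
    """Most data (in decimal GB) a port could move in a month if fully saturated."""
    bytes_per_second = port_megabits_per_s * 1_000_000 / 8
    seconds = days * 24 * 60 * 60
    return bytes_per_second * seconds / 1_000_000_000


def raid0_failure_probability(per_drive_probability, drives):
    """Chance that at least one drive fails, which takes down a whole RAID0 array."""
    return 1 - (1 - per_drive_probability) ** drives


print(monthly_transfer_gb(10))             # ceiling for the 10 Mb port
print(raid0_failure_probability(0.03, 2))  # hypothetical 3% per drive, 2 drives
```

Under those assumptions a saturated 10 Mb port tops out at roughly 3,240 GB in a 30-day month, so the 3000 GB allowance is close to what the port could carry anyway. The second function shows why striping across drives raises the risk: the array is lost if any single drive fails, so the chance of losing it grows with every drive added.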

Sorry to clog up the touring forum with this stuff, apologies to anyone who's not interested in the crazyguyonabike soap opera!

Neil



All times are GMT -6.