jsd / Member

We apologize for the inconvenience

So today was one of those days where you really don't want to get out of bed. Around 2PM all the power in the office went out. That was kinda fun. You hear a bunch of people yell "OH CRAP" and then they emerge, blinking, from staring at their screens to re-engage with the real world. We were wandering out, making dumb jokes, and then the lights came back on. Back to work!
My computer did not want to start up again, and my phone was complaining about no DHCP or FTP servers (we have very complex phones at CNET). Eventually I got it up and running, and was struggling with getting email working and then... WHAM. All the power went out again. Everybody gathered out in the main area and started laughing and joking again. A bunch of people whipped out their Nintendo DS and started playing Mario Kart.
The lights came back on again. The DS players did not look up. "Fool me once" and all that. I didn't blame them. The lights went out again. OK, maybe we should all just go home and forget about work. Except that by now we'd heard that the power outage was affecting other buildings, maybe more of the city. Few of us drive, and nobody wants to get stuck on a train without power.
Finally, lights back on, and it looks like maybe this time it's for real. I get back to my desk. Hm, my phone is going nuts with pages complaining about the sites being down. I get a call. Power went out at the data center and didn't come back. Um, don't we pay them a jillion dollars because they have multiple redundant power systems with diesel generators and batteries and so forth? Yes. Yes we do. So what happened? Nobody knows yet. Power's on in the city but the data center is only getting 50% of their juice. Half our machines aren't starting. The other half are freaking out because they depend on certain machines to function, and those are the ones that are down (of course).
The colo is running on generator now and getting back online with the municipal grid. It's getting to the point where enough is functioning that I can take stock of the situation and see about getting the sites going again. Databases have been corrupted (they don't like it when you pull the plug in the middle of a write), replication is failing, and key services that are supposed to auto-restart didn't. What started out as a 15-minute mild diversion in the work day is turning into a major project.
Several hours later I've got all the wreckage sorted. I'm tired. Just time to watch one episode of Entourage and go to bed.