Sysadmin

435,265


That’s the number of emails sent out this morning by a test service that was getting pummeled by an automated QA script.

Mood: Cranky.

[Update: after many eyes explored the logs, the QA test script was found to have done exactly the right thing, and the bug was in the actual service. So, a big huzzah for catching a truly crippling bug before it reached Production, but damn that was a mess.]

ESXi 5.1 on Dell


We bought a Dell R620 to run VMware ESXi 5.1U1. It was pre-configured to correctly boot the supplied ESXi image from an SD card. Bringing it up on the network was trivial. Downloading the Windows vSphere Client software was trivial. Configuring a datastore so that you could actually use the product was annoying.

Y’see, they shipped it with a Windows GPT partition table, and attempting to use the disk produced a lengthy timeout and disconnect, every time. Occasionally, I’d get a pop-up error message, but couldn’t select it to cut and paste, and enabling ssh on the server showed that no errors were being logged.

Typing the error message in by hand (“… HostDatastoreSystem.QueryVmfsDatastoreCreateOptions … failed”) and googling it turned up detailed solutions for the problem, with obsolete commands. So, for the benefit of anyone else who gets into this state on ESXi 5.1:

  1. Enable ssh, and log in as root.
  2. Run esxcli storage core path list, locate your disk by the display name that showed up in the vSphere Client, and save the contents of the Device: field (mine was naa.6b8ca3a0e8405800195f77a21641467c).
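  3. Run partedUtil mklabel /vmfs/devices/disks/device msdos, replacing device with the value you saved in step 2 (this overwrites the existing partition table).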

Now you can use it as a datastore.

Three hours of my life I want back…
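For anyone who would rather not eyeball the path list, here is a minimal Python sketch of step 2, plus a helper that builds the step 3 command without running it. It assumes the esxcli output has been saved as text and is laid out as an unindented path ID followed by indented "Key: value" lines, including the Device: and Device Display Name: fields. That layout, and the function names find_devices and mklabel_command, are assumptions for illustration, so check them against what your host actually prints.

```python
def find_devices(path_list_text, name_fragment):
    """Return Device values whose display name contains name_fragment.

    Assumes each record starts with an unindented path ID line and is
    followed by indented "Key: value" lines (layout is an assumption).
    """
    wanted = name_fragment.lower()
    matches = []
    device = None
    for line in path_list_text.splitlines():
        if not line.strip():
            continue  # skip blank separator lines
        if not line[0].isspace():
            device = None  # unindented line: a new path record begins
            continue
        key, _, value = line.strip().partition(":")
        key, value = key.strip(), value.strip()
        if key == "Device":
            device = value
        elif key == "Device Display Name" and device and wanted in value.lower():
            if device not in matches:  # several paths can share one device
                matches.append(device)
    return matches


def mklabel_command(device):
    """Build (but do not run) the partedUtil command from step 3."""
    return "partedUtil mklabel /vmfs/devices/disks/" + device + " msdos"
```

Feed find_devices the saved output and a piece of the display name from the vSphere Client. If exactly one device comes back, pass it to mklabel_command and read the result before pasting it into the ssh session, since relabeling the wrong disk is not something you can undo.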


“No, we just moved our office, we didn’t change anything except the external IP address. The VPN problem must be on your end. Did you set the new IP address?”

“Okay, we did install a new NAT router. But the problem must be on your end. Did you set the new IP address?”

“Oh, yes, it’s running a newer version of the OS. But the problem must be on your end. Did you set the new IP address?”

“Here are screenshots of our config. But the problem must be on your end. Did you set the new IP address?”

“Yes, we set it up with IKEv2 instead of v1. But the problem must be on your end. Did you set the new IP address?”

It’s actually been more than eight hours, and they still haven’t fixed their problem, but I at least got some sleep in the middle. We’d still be arguing about what the problem actually is if they hadn’t sent me the screenshots.

Oh, and it was urgent for me to make the change on my end Friday night (which they told me about on Friday afternoon…), but no one at their end actually checked their router for connectivity until this morning. And it’s been nearly an hour since they responded to the message that they’re using the wrong IKE version, but they still haven’t fixed it.

[Update: to add insult to injury, I just got a recruiting email from WalmartLabs. Perhaps the fact that it’s raining in Northern California in late June should have been a clue that the week was going to be a little odd.]

Important NTP safety tip


When the clocks on internal hosts are drifting out of sync despite the fact that everything runs NTP, make sure that the server everything is pointed at isn’t pointed at itself.

(Unless it has an attached GPS or other source of correct time, of course, which this one didn’t.)
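A quick way to catch this is to scan the server's NTP config for upstream entries that are really the server itself. The Python sketch below is a minimal version, assuming an ntp.conf-style file with server, peer, or pool lines. The function name and the idea of passing in the host's own names and addresses by hand are assumptions for illustration, and it only compares strings, so it will not notice an alias that resolves back to the same box.

```python
def self_referencing_servers(ntp_conf_text, own_names):
    """Return upstream entries in the config that match one of own_names.

    own_names: hostnames and IP addresses the caller knows belong to
    this server (supplied by hand here; no DNS lookups are done).
    """
    own = {name.lower() for name in own_names}
    hits = []
    for raw in ntp_conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(fields) >= 2 and fields[0] in ("server", "peer", "pool"):
            target = fields[1].lower()
            if target in own:
                hits.append(target)
    return hits
```

An empty list means nothing obvious is wrong. Anything else is a line worth reading before blaming the clients.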

Dear users,


When you detect that an incoming email contains a virus-infected attachment, please do not forward the virus-infected message to other people saying “hey, if you got this, don’t open it”.

Followup questions are important…


User: “Help! I can’t find some of the files I need on the server for this morning’s meeting.”

Sysadmin: “Okay, that server looks fine, and we have good backups. What folders are missing files?”

User: “Well, I was looking in the agendas folder, and then it was gone, and there was a porn folder, and a sexy pictures folder, and…”

Sysadmin: “That sounds a little more serious than missing files. We’re on our way.”

EnGenius ENH202 Wireless Bridge


Good news: the building we’re moving into has never been occupied by another company. Bad news: it’s never been occupied by another company. In other words, there isn’t a single incoming network cable of any kind, and the few people willing to wire the place up are all running a bit behind schedule. If we had something better than a 3G modem, we could at least move a few people over there early, but so far nobody’s delivered. (…and a firmly extended middle finger to Comcast, who offered us a great deal and then tried to get us to pay more than $10,000 to extend their network so it could reach the building.)

Fortunately, the new place isn’t that far from the old place, and even more fortunately, the EnGenius ENH202 is trivial to configure and costs less than $100. And unlike the $300+ wireless bridge we tried, it actually powers up when you plug it in!

And it works quite nicely so far. No serious environmental sealing, so in a long-term installation you’d want to cover it in some fashion, but we’ll be happy if it lasts through Christmas.

[Update: damn this thing worked out nicely, making the move a lot less painful.]

Dear APC,


When someone plugs a serial cable into one of your commercial-grade UPS units, the correct response is not to shut the unit off, interrupting power to the expensive device that’s being protected by it.

“Need a clue, take a clue,
 got a clue, leave a clue”