Sysadmin

"This -- is wrong tool. Never use this."


Today, my life is devoted to cleaning up after an old automated daily backup script that included a line like this:

tar cpf - . | (cd /mount/subdir; tar xpf -)

Guess what happens when /mount/subdir doesn’t exist? “Hey, why are all these files truncated to some multiple of 512 bytes in size? And why are they now owned by root?”
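
For anyone who hasn't been bitten by this yet: with `;`, a failed `cd` doesn't stop the second tar, so it unpacks into the current directory, on top of the very files the first tar is still reading. A minimal sketch of a safer version follows, assuming a plain POSIX shell. The `copy_tree` function name and its arguments are made up for illustration and are not what the original script looked like.

```shell
#!/bin/sh
# copy_tree SRC DEST: copy the contents of SRC into DEST using tar,
# preserving permissions. The function and argument names are hypothetical;
# the original script used "." and /mount/subdir directly.
copy_tree() {
    src=$1
    dest=$2

    # Refuse to start if the destination is missing, rather than
    # letting the extract side run somewhere unintended.
    if [ ! -d "$dest" ]; then
        echo "copy_tree: $dest does not exist, refusing to copy" >&2
        return 1
    fi

    # "&&" instead of ";": the extracting tar only runs if cd succeeded.
    (cd "$src" && tar cpf - .) | (cd "$dest" && tar xpf -)
}
```

Either the up-front directory check or the `&&` alone would have prevented the mess. Using both is cheap insurance in something that runs unattended every night.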

You've been a sysadmin too long when...


…while walking to the restroom in search of relief, you:

  1. spot a printer with a paper jam,
  2. fix the jam,
  3. wait to see if it's fixed,
  4. clear the second jam,
  5. diagnose the problem,
  6. solve the problem,
  7. reload the paper tray,
  8. verify that it's printing correctly.

Then you resume your trip to the restroom.

Dear netpbm maintainers,


I hope I am not the first to point out just how pompous and wrong-headed the following statement is:

In Netpbm, we believe that man pages, and the Nroff/Troff formats, are obsolete; that HTML and web browsers and the world wide web long ago replaced them as the best way to deliver documentation. However, documentation is useless when people don't know where it is. People are very accustomed to typing "man" to get information on a Unix program or library or file type, so in the standard Netpbm installation, we install a conventional man page for every command, library, and file type, but all it says is to use your web browser to look at the real documentation.

Translation: We maintain a suite of tools used by shell programmers, and we think that being able to read documentation offline or from the shell is stupid, so rather than maintain our documentation in a machine-readable format, we just wrote HTML and installed a bunch of “go fuck yourself” manpages.

On the bright side, they wrote their own replacement for the “man” command that uses Lynx to render their oh-so-spiffy documentation (assuming you’ve installed Lynx, of course), but they don’t even mention it in their fuck-you manpages. Oh, and the folks at darwinports didn’t know about this super-special tool, so they didn’t configure it in their netpbm install.

A-baka: “Hey, I know what we’ll do with our spare time! We can reinvent the wheel!”

B-baka: “Good idea, Dick! No one’s ever done that before, and everyone will praise us for its elegance and ideological purity, even though it’s incompatible with every other wheel-using device!”

A-baka: “We’re so cool!”

Update!: it keeps getting better. Many shell tools have some kind of help option that gives a brief usage summary. What do the Enlightened Beings responsible for netpbm put in theirs?

%  pnmcut --help
pnmcut: Use 'man pnmcut' for help.

Assholes.

Server dog slow today


I’m getting consistent 190ms pings to my server, despite 10ms pings to the router it’s connected to. It’s not server load, it’s not the bandwidth throttling rules in my firewall config, and I’m not seeing any errors in netstat or dmesg output. My best guess right now is a duplex mismatch on the switch. I’m waiting to hear back from the network guys.

Update: supporting evidence for my switch theory: Scott’s machine in the same rack, recycledbits.org, has the same problem, and I get 360ms pings from mine to his, without ever touching a router.

On the “damn nuisance” front, however, email to ViaNet tech support comes back with one of those stupid challenge/response verification schemes. This is precisely the wrong approach for your primary tech-support contact method. Maybe if you’d actually answered the phone when I called, I wouldn’t mind so much, but come on, grab a clue, eh?

Update: oh, that’s much better.

My need for fluff and fan-service


On Tuesday, a server we rely on that’s located in another state, under someone else’s control, went poof. They have another machine we can upload to, though, so I changed all references to point to it.

All the ones I knew about, that is. A little-used script in a particular branch of our software had a hardcoded reference to the dead host, which it used to download previous uploads to produce a small delta release. The result, of course, was a failure Wednesday that left the QA group twiddling their thumbs until I could fix things. In the end, other failures turned up that prevented them from getting the delta release, but they could live with a full release, and that’s what they got.

That was my day from about 7am to 2pm, not counting the repeated interruptions as I explained to people that the backup server we were uploading to had about half the bandwidth of the usual connection, so data was arriving more slowly.

Things proceeded normally for a few hours, until the next fire at 4:30pm. A server responsible for half a dozen test builds and two release builds had a sudden attack of amnesia, forgetting that a 200GB RAID volume was supposed to be full of data. A disk swap brought it back to life as a working empty volume, but by that time I’d moved all the builds to other machines. I’ll test it today before putting it back in service.

Just as I was finishing up with that mess and verifying that the builds would work in their new homes, our primary internal DNS/NIS server went down. The poor soul who’d just finished rebuilding my RAID volume had barely gotten back to his desk when he had to walk three blocks back to the data center. Once that machine was healthy again, I cleaned up some lock files so that test builds would resume, and waited for the email telling me what was supposed to be on the custom production CD-ROM they’re shipping overseas today.

That, of course, was IT’s cue to take down the mail server for maintenance. Planned and announced, of course, but also open-ended, so I had no idea when it would be back. Didn’t matter, though, because then my DSL line went down. I’d never made it out of the house, you see, and was doing all of this remotely.

The email I was waiting for went out at 9:30pm, I got it at 10:45pm, and kicked the custom build off at 11pm. It finished building at 12:30am and started the imaging process, which makes a quick query to the Perforce server.

Guess what went down for backups at 12am, blocking all requests until it completed at around 3am? Nap time for J!

At 4:45am I woke up, checked the image, mailed off a signing request so it could actually be used to boot a production box, set the alarm clock for 6:45am, and went back to sleep.

This was not a day for deep, thought-provoking anime. It was a day for Grenadier disc 2 and Maburaho disc 4 (which arrived from Anime Corner Store just about the time the mail server went down). I considered getting started on DearS disc 2 and Girls Bravo disc 3, which also showed up, but decided instead to make a badly-needed grocery run.

Adobe Version Cue 2: here we go again...


Apparently the folks at Adobe haven’t learned anything about computer security since I looked at the first release of Version Cue. After I installed the CS2 suite last night, I was annoyed at what I found.

Listens on all network interfaces by default? Check. Exposes configuration information on its web-administration page? Check. Defaults to trivial password on the web-admin page? Check. Actually prints the trivial default password on the web-admin page? Check. Defaults to sharing your documents with anyone who can connect to your machine? Check. I could go on, but it’s too depressing.

The only nice thing I can say about it is that it doesn’t add a new rule to the built-in Mac OS X firewall to open up the ports it uses. As a result, most people will be protected from this default stupidity.

Minolta Maxxum 7D glitch


[Update 7/23/05: okay, the rule of thumb seems to be, “if you can’t handhold a 50mm f/1.4 at ISO 100-400 and get the shot, spot-meter off a gray card and check the histogram before trusting the exposure meter”. This suggests some peculiarities in the low-light metering algorithm, which is supported by the fact that flash exposures are always dead-on, even in extremely dim light.]

[Update 7/22/05: after fiddling around with assorted settings, resetting the camera, and testing various lenses with a gray card, the camera’s behavior has changed. Now all the lenses are consistently underexposing by 2/3 of a stop. This is progress of a sort, since I can freely swap lenses and get excellent exposures… as long as I set +2/3 exposure compensation. I think my next step is going to be reapplying the firmware update. Sigh.]

The only flaw I’ve noticed in my 7D was what looked at first like a random failure in the white-balancing system. Sometimes, as I shot pictures around the house, the colors just came out wrong, and no adjustment seemed to fix it in-camera.

Tonight, I started seeing it consistently. I took a series of test shots (starting with the sake bottle, moving on to the stack of Pocky boxes…) at various white balance settings, loaded them into Photoshop, and tried to figure out what was going on. Somewhere in there, I hit the Auto Levels function, and suddenly realized that the damn thing was simply underexposing by 2/3 to 1 full stop.

Minolta has always been ahead of the curve at ambient-light exposure metering, which is probably why I didn’t think of that first. It just seemed more reasonable to blame a digital-specific feature than one that they’ve been refining for so many years.

With that figured out, I started writing up a bug report, going back over every step to provide a precise repeat-by. Firmware revision, lens, camera settings, test conditions, etc. I dug out my Maxxum 9 and Maxxum 7 and mounted the same lens, added a gray card to the scene, and even pulled out my Flash Meter V to record the guaranteed-correct exposure. All Minolta gear, all known to produce correct exposures.

Turns out it’s the lens. More precisely, my two variable-aperture zoom lenses exhibited the problem (24-105/3.5-4.5 D, 100-400/4.5-6.7 APO). The fixed focal-length lenses (50/1.4, 85/1.4, 200/2.8) and fixed-aperture “pro” zoom lenses (28-70/2.8, 80-200/2.8) worked just fine with the 7D, on the exact same scene. Manually selecting the correct exposure with the variable-aperture zooms worked as well.

These are the sorts of details that make a customer service request useful to tech support. I know I’m always happier when I get them.

I always knew they were real


Everywhere I’ve worked, people believe in them. They’re the ones who clear jams, change toner cartridges, reload the paper trays, and clean up the messy pile of abandoned printouts, and finally they’ve been captured on film. I give you…


“Need a clue, take a clue,
 got a clue, leave a clue”