“Is that something we can change? We have friends in the White House now!”
- Anna Wintour, discovering that price-fixing and collusion are illegal
Okay, I’m stumped. We have a ReadyNAS NV+ that holds Important Data, accessed primarily from Windows machines. Generally, it works really well, and we’ve been pretty happy with it for the last few months.
On Monday, the Windows application that reads and writes the Important Data locked up on the primary user’s machine. It threw cryptic error messages that decrypted to “contact service for recovering your corrupted database”.
Nightly backups of the device via the CIFS protocol worked fine. Reading and writing to the NAS from a Mac via CIFS worked fine. A second Windows machine equipped with the application worked fine, without any errors about corrupted data. I left the user working on that machine for the day, and did some after-hours testing that night.
The obvious conclusion was that the crufty old HP on the user’s desk was the problem (it had been moved on Friday), so I yanked it out of the way and temporarily replaced it with the other, working Windows box.
It didn’t work. I checked all the network connections, and everything looked fine. I took the working machine back to its original location, and it didn’t work any more. I took it down to the same switch as the NAS, and it didn’t work. My Mac still worked fine, though, so I used it to copy all of the Important Data from the ReadyNAS to our NetApp.
Mounting the NetApp worked fine on all machines in all locations. I can’t leave the data there long-term (in addition to being Important, it’s also Confidential), but at least we’re back in business.
I’m stumped. Right now, I’ve got a Mac and a Windows machine plugged into the same desktop gigabit switch (gigabit NICs everywhere), and the Mac copies a 50MB folder from the NAS in a few seconds, while the Windows machine gives up after a few minutes with a timeout error. The NAS reports:
The only actual hardware problem I ever found was a loose cable in the office where the working Windows box was located.
[Update: It’s being caused by an as-yet-unidentified device on the network. Consider the results of my latest test: if I run XP under Parallels on my Mac in shared (NAT) networking mode, it works fine; in bridged mode, it fails exactly like a real Windows box. Something on the subnet is passing out bad data that Samba clients ignore but real Windows machines obey. The NetApp works because it uses licensed Microsoft networking code instead of Samba.]
[8/23 Update: A number of recommended fixes have failed to either track down the offending machine or resolve the problem. The fact that it comes and goes is more support for the “single bad host” theory, but it’s hard to diagnose when you can’t run your tools directly on the NAS.
So I reached for a bigger hammer: I grabbed one of my old Shuttles that I’ve been testing OpenBSD configurations on, threw in a second NIC, configured it as an ethernet bridge, and stuck it in front of the NAS. That gave me an invisible network tap that could see all of the traffic going to the NAS, and also the ability to filter any traffic I didn’t like.
Just for fun, the first thing I did was turn on the bridge’s “blocknonip” option, to force Windows to use TCP to connect. And the problem went away. I still need to find the naughty host, but now I can do it without angry users breathing down my neck.]
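For anyone wondering what that option actually filters: OpenBSD's bridge documentation describes "blocknonip" as accepting only IPv4, IPv6, ARP, and Reverse ARP frames on the marked interface. Here is a minimal Python sketch of that rule, not the bridge's real code. The function name and sample frames are made up for illustration, it ignores LLC/SNAP-encapsulated IP, and it says nothing about which non-IP protocol the Windows boxes were actually falling back to.

```python
import struct

# EtherTypes that the OpenBSD manual lists as allowed by "blocknonip".
ALLOWED_ETHERTYPES = {
    0x0800: "IPv4",
    0x0806: "ARP",
    0x8035: "Reverse ARP",
    0x86DD: "IPv6",
}


def passes_blocknonip(frame: bytes) -> bool:
    """Hypothetical helper: would this Ethernet frame pass the filter?

    Simplified on purpose: only the 2-byte type/length field that
    follows the two 6-byte MAC addresses is inspected.
    """
    if len(frame) < 14:
        # Too short to hold an Ethernet header at all.
        return False
    (ethertype,) = struct.unpack("!H", frame[12:14])
    return ethertype in ALLOWED_ETHERTYPES


if __name__ == "__main__":
    macs = b"\xff" * 6 + b"\x00\x11\x22\x33\x44\x55"
    ipv4_frame = macs + b"\x08\x00" + b"\x45" + b"\x00" * 19
    # 802.3 frame: the field holds a length (<= 1500), not an EtherType.
    llc_frame = macs + b"\x00\x30" + b"\xf0\xf0\x03" + b"\x00" * 45
    print(passes_blocknonip(ipv4_frame))
    print(passes_blocknonip(llc_frame))
```

With the non-IP frames dropped at the bridge, anything that reaches the NAS has to arrive over IP, which is the effect I was after.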
Windows Vista really likes IPv6 (even tunneling it over IPv4 for you, quietly bypassing your NAT firewall). Outlook 2007 also likes IPv6, and if it’s available, will always try to use it to connect to an Exchange server.
We don’t have an IPv6 infrastructure. One of our wireless access points was configured to hand out IPv6 addresses. Connect the dots.
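One cheap way to spot this from a client is to look at the addresses a machine has picked up and flag any IPv6 address that is neither loopback nor link-local, since on a network with no IPv6 infrastructure nothing should be handing those out. A rough Python sketch follows. It assumes you have already collected the address strings from something like ipconfig or ifconfig output; the function name and sample addresses are hypothetical.

```python
import ipaddress


def routable_ipv6(addresses):
    """Return the IPv6 addresses that are neither loopback nor link-local."""
    found = []
    for text in addresses:
        # Drop any zone suffix such as "%en0" before parsing.
        addr = ipaddress.ip_address(text.split("%")[0])
        if addr.version == 6 and not (addr.is_loopback or addr.is_link_local):
            found.append(str(addr))
    return found


if __name__ == "__main__":
    sample = ["192.168.1.20", "fe80::1%en0", "::1", "2001:db8::10"]
    print(routable_ipv6(sample))
```

A non-empty result on a network that is supposed to be IPv4-only is a hint that some device is advertising IPv6, though it will not tell you which one.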
Yesterday, a user’s VAIO BX640 dropped dead in the middle of a meeting. It didn’t come back, and by that I mean “nothing happens when you press the power button”. After swapping in a different battery and power supply, I called for service.
This afternoon, another user reported that he wasn’t getting sound out of his BX640, and the headphone jack just made ticking noises. It doesn’t even make the magical VAIO noise when you power it on. I swapped parts around, reset the BIOS, etc. No luck. This isn’t a critical issue, so I’ll wait until Monday to ship it off for service, but it’s disturbing, because they’re both motherboard problems. And so was the only other one of my (more than a dozen) BX640s to fail so far, several months ago…
Judging from the first three pages, I’d say the editor:
As their eyes grew accustomed to the lack of light, they were drawn upward to the strangest feature of the scene...
[and, no, I didn’t wait in line last night; I fought past the rodeo crowds to get to Costco this morning to buy steak and garlic bread, and found a giant pile of Potters at the end of an aisle. As expected.]
[I’ll read it tomorrow, perhaps]
Once there was enough caffeine in my system, I remembered the first rule of system administration, and carefully reread the twice-forwarded email. Thanks, Walt; if you hadn’t passed on that key detail, we’d still be looking in the wrong place.
Oh, the rule? “Never let the user diagnose the problem.”
I’m playing with my old Sony XG-19 again. As reported earlier, OpenBSD 4.1 worked but never played DVDs, Fedora 7 blew chunks during the install, and Debian 4.0 worked fine, requiring only a few xorg.conf tweaks and a copy of libdvdcss2.
But it sucked for Japanese, so it had to go. There are all sorts of input managers and applications available, but they don’t all play nice with each other, and the system setup assumes that anyone who wants to type in Japanese wants a completely localized system. You can work around this, eventually, but I lost patience.
So I tried CentOS 5. The graphical install worked fine, the xorg.conf file only needed a one-line change to shut off double-tapping on the trackpad, and once you find DAG, it’s easy to get DVDs playing with VLC (Totem steadfastly refuses to admit which of its plugins are missing, and nothing I install seems to placate it, but who cares?).
The Japanese support in CentOS is much more mature, and offers a user experience reasonably close to Mac OS X or Windows. The default keybindings are naturally different from anything you’ve ever used before, but one has to make some concessions when dealing with Open Source, and it has a “behave like Windows” option.
Now to build the current version of Claws Mail…
[Update: got Claws 2.10 built and running, and unlike my Debian install, it plays nice with the Japanese input method.]
Our product is no longer a secret. Most of the tech blogs and news sites have something up today, although the quality of information varies. I won’t be commenting on it here much.