"I am just mystified at the idea of having preferred pronouns. It's like having a favorite algae."

— Scott P., from the comments on Language Log

Dea Apple,


I caefully tested out the funky keyboad on the MacBook befoe buying one, but after seveal months, I’m eady to send mine in fo sevice. It seems thee’s a poblem with the ‘’ key not eliably egisteing keypesses. This is not a ecent poblem, but one that’s been botheing me since the day I unpacked it.

At fist, I thought I was just having touble adjusting to the key action, but it’s just the ‘’ key. All of the othes work fine evey time, but the ‘’ only woks about 60% of the time.

Unless I press really had, and then it’s still only about 90%. The annoying thing is that I eplaced the stock AM and had dive with bette stuff, and now I have to swap the oiginals back in, o AppleCae won’t touch it. Fotunately, that’s easy to do.

Kanji cross-referencing


Jim Breen’s KANJIDIC includes cross-references for various printed kanji dictionaries, and KANJIDIC2 adds more. I’ve imported KANJIDIC into a SQLite database for use by my Perl scripts, and it’s quite handy (and much faster than repeatedly slurping in the original file and parsing it…).

However, it’s missing two cross-reference indexes that would be quite useful for me: JLPT level and White Rabbit Kanji Flashcards card number.

Most of the online JLPT references predate the 2002 test specifications, so the only reliable source I’ve found is The JLPT Study Page. The creator of that site is working from the latest edition of the test content specs, so apart from the occasional typo in the vocabulary, it’s solid data. It just wasn’t in a form directly useful to me, so I screen-scraped it and generated a simple text file, UTF-8 encoded.

The White Rabbit folks have an online lookup tool so you can generate your own cross-reference lists, but by the time I’d found it, I’d already read the forum article that explains their numbering scheme: Unicode sort order within JLPT level. A few seconds at the shell, and I had another simple text file (extended to include the planned Level 1 card set).
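The numbering scheme is simple enough to sketch in a few lines of Perl. This is only an illustration of "Unicode sort order within JLPT level", not the actual shell one-liner: the `card_numbers` helper and the sample data are made up, and it assumes the numbering restarts at 1 within each level, which the forum article would need to confirm.

```perl
use strict;
use warnings;
use utf8;                               # kanji literals appear in this file
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical helper: takes { kanji => JLPT level } and returns
# { kanji => [level, card number] }.
# Assumption: card numbers restart at 1 within each level.
sub card_numbers {
    my ($level_of) = @_;
    my (%by_level, %card);
    push @{ $by_level{ $level_of->{$_} } }, $_ for keys %$level_of;
    for my $level (keys %by_level) {
        my $n = 0;
        # ord() returns the Unicode code point of a decoded character,
        # so this is plain code-point order.
        for my $kanji (sort { ord($a) <=> ord($b) } @{ $by_level{$level} }) {
            $card{$kanji} = [ $level, ++$n ];
        }
    }
    return \%card;
}

# Tiny illustrative sample; real input would be the scraped JLPT list.
my %sample = ('日' => 4, '一' => 4, '本' => 4, '曜' => 3, '界' => 3);
my $cards  = card_numbers(\%sample);
for my $k (sort { $cards->{$a}[0] <=> $cards->{$b}[0]
               || $cards->{$a}[1] <=> $cards->{$b}[1] } keys %$cards) {
    printf "%s\tL%d\t%d\n", $k, @{ $cards->{$k} };
}
```

If the real cards turn out to be numbered continuously across levels, the only change is to keep `$n` running instead of resetting it per level.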

Open Egos


Just lean back and inhale the fumes:

"What does Firefox have to do with social justice? How will the one laptop per child project discourage genocide? How soon will Microsoft collapse? Watch Eben Moglen's inspiring keynote from the 2006 Plone Conference.

'If we know that what we are trying to accomplish is the spread of justice and social equality through the universalization of access to knowledge; If we know that what we are trying to do is build an economy of sharing which will rival the economies of ownership at every point where they directly compete; If we know that we are doing this as an alternative to coercive redistribution, that we have a third way in our hands for dealing with long and deep problems of human injustice; If we are conscious of what we have and know what we are trying to accomplish, when this is the moment for the first time in lifetimes, we can get it done.'"

Eiken for more?


What does the artist who created the Eiken manga do for an encore? Zokusei, with the thoroughly-anonymous 〇〇くん guiding the “reader” through first-person fanservice-y encounters with every bishoujo cliché in the book. He still loves supersizing, but unlike Eiken (the anime, at least; I avoided the manga…), the things attached to the girls’ chests are probably breasts, and some of their figures are not alien to this species. Being manga, you’re also spared the sloshing-mudsack animation that helped make the anime completely unwatchable. Their eyes tend toward the psychotic, but other than that, most of the art’s actually not bad.

Okay, I wouldn’t have bought it if I’d noticed the small print at the bottom that said “from the creator of Eiken”. I thought one of the girls on the cover looked cute, it was only $5, and it promised 「美少女20人大集結!!」.

Each chapter is devoted to showing off a different girl, with just enough story to accurately classify her (tsundere, ojou, meganekko, iincho, Yankee, American, etc). All of them have names, ages, blood types, heights, weights, and measurements. For educational purposes, I’ve calculated the average statistics of the girls, excluding the 5 teachers and the little sister: 16 years old, 5’3”, 101 pounds, 34E-22-33. Note that the mean cup size is skewed by three mutants: 2 I’s and a J; the rest average an overstuffed C. The five teachers average 39H, a sure sign that this comic is set on a low-G planet.

Oh, and it has furigana, so I can officially consider it study material.

Well, now, there's a bit of a surprise


From the nice folks at the National Weather Service, via the new Weather.com Dashboard widget:

259 PM PST FRI NOV 24 2006 ...Frost advisory in effect from 2 AM to 8 AM PST Saturday...

The National Weather Service in San Francisco has issued a frost advisory…which is in effect from 2 AM to 8 AM PST Saturday.

Further drying was experienced today as dew points continued to drop across much of the Salinas Valley. Relatively clear skies tonight combined with low dew points will allow overnight lows to tumble into the upper 20s and lower 30s. Areas of frost will likely develop as several hours of near freezing temperatures are experienced.

A frost advisory means that frost is possible. Sensitive outdoor plants may be killed if left uncovered.

JMdict + XML::Twig + DBD::SQLite = ?


In theory, I’m still working at Digeo. In practice, not so much. As we wind our way closer to the layoff date, I have less and less actual work to do, and more and more “anticipating the future failures of our replacements”. On the bright side, I’ve had a lot of time to study Japanese and prepare for Level 3 of the JLPT, which is next weekend.

I’m easily sidetracked, though, and the latest side project is importing the freely-distributed JMdict/EDICT and KANJIDIC dictionaries into a database and wrapping it with Perl, so that I can more easily incorporate them into my PDF-generating scripts.

Unfortunately, all of the tree-based XML parsing libraries for Perl create massive memory structures (I killed the script after it got past a gig), and the stream-based ones don’t make a lot of sense to me. XML::Twig’s documentation is oriented toward transforming XML rather than importing it, but it’s capable of doing the right thing without writing ridiculously unPerly code:

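use XML::Twig;

my $twig = XML::Twig->new(
        twig_handlers => { entry => \&parse_entry });
$twig->parsefile('JMdict');
 
# Called once per complete <entry>; $_[0] is the twig, $_[1] the element.
sub parse_entry {
        my $ref = $_[1]->simplify;
        print "Entry ID=",$ref->{ent_seq},"\n";
        #...
        # Drop the processed element so the tree doesn't grow with the file.
        $_[1]->delete;
}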

SQLite was the obvious choice for a back-end database. It’s fast, free, stable, serverless, and Apple’s supporting it as part of Core Data.

Putting it all together meant finally learning how to read XML DTDs, getting a crash course in SQL database design to lay out the tables in a sensible way, and writing useful and efficient queries. I’m still working on that last part, but I’ve gotten far enough that my lookup tool has basic functionality: given a complete or partial search term in kanji, kana, or English, it returns the key parts of the matching entries. Getting all of the available data assembled requires both joins and multiple queries, which is tedious to sort out.
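To show why a lookup needs both joins and follow-up queries, here is a minimal sketch. It is not the real schema or lookup tool: the three-table layout, the `lookup` helper, and the sample rows (including the entry IDs) are invented for illustration. The one thing taken from the JMdict DTD is that kanji forms are optional while readings are required, hence the LEFT JOIN.

```perl
use strict;
use warnings;
use utf8;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, sqlite_unicode => 1 });

# Hypothetical layout: one row per kanji form, reading, or gloss.
$dbh->do($_) for (
    'CREATE TABLE kanji   (ent_seq INTEGER, keb   TEXT)',
    'CREATE TABLE reading (ent_seq INTEGER, reb   TEXT)',
    'CREATE TABLE gloss   (ent_seq INTEGER, gloss TEXT)',
);
# Made-up sample rows; entry 2 is kana-only, so it has no kanji row.
$dbh->do('INSERT INTO kanji   VALUES (?, ?)', undef, @$_) for [1, '猫'], [3, '犬'];
$dbh->do('INSERT INTO reading VALUES (?, ?)', undef, @$_)
    for [1, 'ねこ'], [2, 'これ'], [3, 'いぬ'];
$dbh->do('INSERT INTO gloss   VALUES (?, ?)', undef, @$_)
    for [1, 'cat'], [2, 'this'], [3, 'dog'];

sub lookup {
    my ($dbh, $term) = @_;
    my $like = "%$term%";
    # Step 1: a single joined query finds the IDs of matching entries.
    my $ids = $dbh->selectcol_arrayref(q{
        SELECT DISTINCT r.ent_seq
          FROM reading r
          LEFT JOIN kanji k ON k.ent_seq = r.ent_seq
          JOIN gloss g      ON g.ent_seq = r.ent_seq
         WHERE k.keb LIKE ? OR r.reb LIKE ? OR g.gloss LIKE ?
         ORDER BY r.ent_seq
    }, undef, $like, $like, $like);
    # Step 2: per-entry queries collect each list without the row
    # multiplication a three-way join would produce.
    my @results;
    for my $id (@$ids) {
        my %entry = (ent_seq => $id);
        for my $t ([kanji => 'keb'], [reading => 'reb'], [gloss => 'gloss']) {
            my ($table, $col) = @$t;
            $entry{$table} = $dbh->selectcol_arrayref(
                "SELECT $col FROM $table WHERE ent_seq = ?", undef, $id);
        }
        push @results, \%entry;
    }
    return \@results;
}

my $hits = lookup($dbh, 'cat');
print scalar(@$hits), " match(es) for 'cat'\n";
```

An entry with several kanji forms, readings, and glosses would come back as one row per combination from a plain join, which is why the sketch splits the work in two.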

I started with JMdict, which is a lot bigger and more complicated, so importing KANJIDIC2 is going to be easy when I get around to it. That won’t be this week, though, because the JLPT comes but once a year, and finals week comes right after.

[side note: turning off auto-commit and manually committing after every 500th entry cut the import time by 2/3]
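The pattern in that side note looks roughly like this with DBI. It's a minimal sketch under assumptions, not the actual import script: the `entry` table, its columns, and the `import_entries` helper are hypothetical, and it uses an in-memory database with stand-in data so it runs on its own.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical helper: insert [ent_seq, gloss] pairs, committing once
# every $batch rows instead of after each row.
sub import_entries {
    my ($dbh, $entries, $batch) = @_;
    my $sth = $dbh->prepare(
        'INSERT INTO entry (ent_seq, gloss) VALUES (?, ?)');
    my $count = 0;
    for my $e (@$entries) {
        $sth->execute(@$e);
        $dbh->commit if ++$count % $batch == 0;
    }
    $dbh->commit;    # flush the final partial batch
    return $count;
}

# With AutoCommit off, nothing is written permanently until commit().
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { AutoCommit => 0, RaiseError => 1 });
$dbh->do('CREATE TABLE entry (ent_seq INTEGER PRIMARY KEY, gloss TEXT)');

my @fake = map { [ $_, "gloss $_" ] } 1 .. 1234;    # stand-in data
my $n = import_entries($dbh, \@fake, 500);
my ($rows) = $dbh->selectrow_array('SELECT COUNT(*) FROM entry');
print "inserted $n, table holds $rows\n";
$dbh->disconnect;
```

The batch size is a tuning knob: 500 is the figure from the note above, and the best value for another data set may differ.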

Reading for comprehension


It’s not a bad collection of sci-fi babes, but I’m not the only one who choked on this line about Heinlein’s Starship Troopers:

One of the main plot points of Heinlein’s original novel was that all the experienced officers were killed off, leaving only the kids in charge.

What are they teaching kids in school these days?

Truth in advertising


Spotted this at Borders today. I like products that match their descriptions…

Lighted Magnifier

The name is effective, though. When I pointed it out to Jeff so he could laugh at it too, the woman in line behind him asked to see it, and ended up buying one.

It’s probably nice, but I think the Zelco Lumifier is better for carrying around. It’s my furigana tool.

“Need a clue, take a clue,
 got a clue, leave a clue”