“I’d be delighted to live in a country where happily married gay couples had closets full of assault weapons.”

— Instapundit

Summer Princess


Mikako Takahashi is one of many voice actresses (Kasumi Tani from Hand Maid May, Rushuna from Grenadier, etc.) who also sing. I’m rather fond of her ED song from HMM (Honto no Kimochi, “My True Feelings”), so when I noticed that she released two albums last year, I added them to my list for a future purchase from Amazon Japan.

I finally got around to buying them, and while my initial impression of the songs is mixed, I can find nothing to dislike about the pictures…

Especially the red dress and the bikini…


Words of Wisdom...


…sort of. So speaks Izumi Kojima, in the song “Ah ~ yokatta”:

There is nothing for us to lose.
Sure, I can say. I can say.
Nobody knows what it means,
"Hung in there!"
But I'll be right beside you from now on.
So on...

You're doing it wrong...


I mean, come on:

"Dear Slashdot, how do I gain social skills?"

How to make an old cellphone sexy...


Hand it to Han Ga Eun (한가은)…


Using Abbyy FineReader Pro for Japanese OCR


[Update: if you save your work in the FineReader-specific format, then changes you make after that point will automatically be saved when you exit the application; this is not clear from the documentation, and while it’s usually what you want, it may lead to unpleasant surprises if you decide to abandon changes made during that session.]

After several days with the free demo, in which I tested it with sources of varying quality and tinkered with the options, I bought a license for FineReader Pro 9.0 (at the competitive upgrade price, which is available to anyone who owns any OCR product). I then spent a merry evening working through an album where the liner notes were printed at various angles on a colored, patterned background. Comments follow:

  • Turn off all the auto features when working with Japanese text.
  • In the advanced options, disable all target fonts for font-matching except MS Mincho and Times New Roman. Don't let it export as MS Gothic; you'll never find all of the ー/一 errors.
  • Get the cleanest 600-dpi scan you can. This is sufficient for furigana-sized text on a white background.
  • Set the target language to Japanese-only if your source is noisy or you're sure there's no random English in the text. Otherwise, it's safe to leave English turned on.
  • Manually split and deskew pages if the separation isn't clean in the scan.
  • Adjust the apparent resolution of scans to set the output font size, before you tell it to recognize the text.
  • Manually draw recognition areas if there's anything unusual about your layout.
  • Rearrange the windows to put the scan and the recognized text side-by-side.
  • Don't bother with the spell-checker; it offers plausible alternative characters based on shape, but if the correct choice isn't there, you have to correct it in the main window anyway. Just right-click as you work through the document to see the same data in context.
  • You can explicitly save in a FineReader-specific format that preserves the entire state of your work, but it creates a new bundle each time, and it won't overwrite an existing one with the same name. This makes it very annoying when you want to simply save your progress as you work through a long document; each new save includes a complete copy of the scans, which adds up fast.
  • If you figure out how to get it to stop deleting every full-width kanji whitespace character, let me know; it's damned annoying when you're trying to preserve the layout of a song.
  • Once you've told it to recognize the text, search the entire document for these common errors:
    • っ interpreted as つ and vice-versa
    • ー interpreted as 一 and vice-versa; check all other nearby katakana for "small-x as x" errors while you're at it
    • 日 interpreted as 曰
    • Any English-style punctuation other than "!", ":", "…", or "?"; most likely, they should be the katakana center-dot, but it might have torn a character apart into random fragments (rare, unless your background is noisy).
    • The digits 0-9; if your source is noisy, random kanji and kana can be interpreted as digits, even when English recognition is disabled.
  • Delete any furigana it happens to recognize, unless you're exporting to PDF; it just makes a mess in Word.
  • In general, export to Word as Formatted Text, with the "Keep line breaks" and "Highlight uncertain characters" options turned on.
  • If your text is on a halftoned background and you're getting a lot of errors, load up the scan in Photoshop, use the Strong Contrast setting in Curves, then try out the various settings under Black & White until you find one that gets rid of most of the remaining halftone dots (I had good luck with Neutral Density). After that, you can Despeckle to get rid of most of the remaining noise, and use Curves again to force the text to a solid black.
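
Part of the error search described in the list above can be roughed out in a script, assuming the recognized text has been exported as plain Unicode text. This is only a sketch: the function name `find_suspects` and the heuristics are illustrative assumptions, not a FineReader feature. It flags 曰, 一 directly after katakana, ー that does not follow kana, English-style punctuation other than "!", ":", and "?", and any digits. It makes no attempt at the っ/つ and other small-kana mix-ups, which still need a human eye.

```python
import string

# Punctuation the list above treats as acceptable; everything else in
# ASCII punctuation is reported. "…" is not ASCII, so it is never flagged.
ALLOWED_PUNCT = set("!:?")
SUSPECT_PUNCT = set(string.punctuation) - ALLOWED_PUNCT
DIGITS = set("0123456789０１２３４５６７８９")


def is_katakana(ch):
    # Katakana letters, plus the long-vowel mark itself.
    return "\u30a1" <= ch <= "\u30fa" or ch == "ー"


def is_kana(ch):
    return is_katakana(ch) or "\u3041" <= ch <= "\u3096"


def find_suspects(text):
    """Return (line, column, char, reason) tuples worth a second look."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        prev = ""
        for col, ch in enumerate(line, start=1):
            if ch == "曰":
                hits.append((line_no, col, ch, "probably 日"))
            elif ch == "一" and is_katakana(prev):
                hits.append((line_no, col, ch, "kanji 'one' after katakana; maybe ー"))
            elif ch == "ー" and not is_kana(prev):
                hits.append((line_no, col, ch, "long-vowel mark not after kana; maybe 一"))
            elif ch in SUSPECT_PUNCT:
                hits.append((line_no, col, ch, "English-style punctuation"))
            elif ch in DIGITS:
                hits.append((line_no, col, ch, "digit; check against the scan"))
            prev = ch
    return hits


if __name__ == "__main__":
    sample = "コ一ヒー\n曰本 2009.\nはい!"
    for line_no, col, ch, reason in find_suspects(sample):
        print(f"line {line_no}, col {col}: {ch} ({reason})")
```

Every hit is only a candidate; legitimate digits and punctuation will show up too, so treat the output as a checklist to compare against the scan rather than a list of corrections.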

Microsoft Bluetooth Notebook Mouse 5000 review, Bad Haiku Edition


Button 2 broke fast;
replacement eats batteries.
Gosh it's pretty, though.

Samurai Tut


Need something to do in San Francisco?

  • King Tut, de Young Museum, June 27, 2009 through March 28, 2010
  • Lords of the Samurai, Asian Art Museum, June 12 through September 20, 2009

Sigh


This is just… sad.

Not so much because there’s a cartoonist who’d draw it, as because I’m seeing it linked approvingly by people who have access to wet matches, with which they can obviously no longer be trusted.

“Need a clue, take a clue,
 got a clue, leave a clue”