“If you call that interfering, there’s something wrong with your Funk ‘n’ Wagnalls.”

— Buck Rogers in the 25th Century, deftly evading the primitive censorship of 1979 television

Parsing Japanese with MeCab


This is a public braindump, to help out anyone who might want to parse Japanese text without being sufficiently fluent in Japanese to read technical documentation. Our weapon of choice will be the morphological analysis tool MeCab, and for sanity’s sake, we’ll do everything in UTF-8-encoded Unicode.

The goal will be to take a plain text file containing Japanese text, one paragraph per line, and extract every word in its dictionary form, with the correct reading, with little or no manual intervention.
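As a rough sketch of that goal (not necessarily the approach the full post takes), the snippet below pulls the dictionary form and reading out of MeCab's default text output. It assumes the IPADIC dictionary layout, where each line is the surface form, a tab, and then comma-separated features, with the base form in the seventh field and the reading in the eighth. The function name `parse_mecab_output` and the hand-written `SAMPLE` string are made up for illustration; other dictionaries (UniDic, for instance) order their fields differently.

```python
def parse_mecab_output(text):
    """Turn MeCab's default output into (surface, base_form, reading) tuples.

    Assumes the IPADIC feature layout: base form is field 7, reading is
    field 8. Words MeCab does not know have no reading field, and may have
    "*" as the base form, so fall back to the surface form in that case.
    """
    words = []
    for line in text.splitlines():
        # "EOS" marks the end of each input line; skip it and blank lines.
        if not line or line == "EOS":
            continue
        surface, _, features = line.partition("\t")
        fields = features.split(",")
        base = fields[6] if len(fields) > 6 and fields[6] != "*" else surface
        reading = fields[7] if len(fields) > 7 else None
        words.append((surface, base, reading))
    return words


# Hand-written sample in the IPADIC layout (an assumption, not captured output).
SAMPLE = (
    "食べ\t動詞,自立,*,*,一段,連用形,食べる,タベ,タベ\n"
    "た\t助動詞,*,*,*,特殊・タ,基本形,た,タ,タ\n"
    "EOS\n"
)

if __name__ == "__main__":
    for surface, base, reading in parse_mecab_output(SAMPLE):
        print(surface, base, reading)
```

In practice the text fed to this function would come from running the `mecab` command on the UTF-8 input file. Note that under these assumptions the reading field describes the word as it appears in the text, not its dictionary form, so conjugated words need a little extra handling.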


Asobou!


I am compelled to make the following observations about the first novel in the Asobi ni iku yo! series.

  1. The series title is given a wonderful Engrish translation as furigana: “Us It goes to play in Your house”.

  2. The compound noun 食料合成機 (literally “food synthesis machine”) has the following pronunciation as furigana: ソイレント・グリーン.

For the kana-impaired, instead of shokuryou-gouseiki, it’s to be read as soirento gureen. Not having seen the anime (yet), I do not know if this joke was carried over.

Dear Google,


Auto-correcting my already-correct spelling of a search (even when I’m refining a search by adding additional keywords after already overriding your miscorrection) is annoying, but auto-correcting it to something that you have no search results for at all? STUPID.

The most reliable miscorrection I’ve found is the Japanese version of the LaTeX document processing system, which goes by the name “pLaTeX”. This is always corrected to “playtex”, even if the other words in the search are so specific that there are no Playtex associations possible, such as 傍点, the marginal dots that are used for emphasis in Japanese text.

Searches for LaTeX often include rubber gloves and fetish items, but the TeX community has been online for so long that they don’t dominate. In fact, I suspect horny rubber-lovers are often frustrated to find themselves receiving advice on pagination, hyphenation, and “how to make your font bigger”.

Dear Amazon,


Because I bought a Kindle, your recommendation system now fills the first several pages of results with random ebooks that I either already own in print or that are not even plausibly related to anything I’ve ever purchased, owned, or searched for.

"Hey, you need to fill your Kindle with books! This is a book! It has a cover and a title page and words inside! You like words, right? Of course you do!"

Here’s one of the least stupid suggestions:

Things that go bump in your kitchen

Because I bought Cook’s Illustrated’s Italian Favorites, I really, really want to read about a monster-hunter who’s in over her head.

Other Kindle-fied “recommendations” include one called Blink, subtitled “The power of thinking without thinking”, because I own Beard on Food.

I can’t blame it all on the Kindle, though; that 750GB laptop drive I just bought led to a recommendation for a Gillette single-blade disposable razor. And there are some actually relevant recommendations, such as Shogun because I bought Exploring Kyoto, and James Beard’s New Fish Cookery because of the aforementioned Beard on Food.

And I really can’t complain about the DVD of Xanadu, recommended because my wishlist contains the Flash Gordon Blu-ray release. That’s just common sense.

Defense Against The Dog Arts


Unrelated: A shrine maiden, a Buddhist nun, and a “Catholic” nun walk into a bar, and…

No, wait, that’s not a bar, it’s a porn novel. My mistake.

Dear Microsoft,


I replaced my secondary hard drive over the weekend. Today I discover that Microsoft Office 2011 is demanding an activation key. Not “you need to go online to reactivate”, but rather “you can no longer use this product until you drive home, find the box, and re-enter the key”.

Permit me to describe my feelings about this.

I’ll keep it simple.

fuckyoufuckyoufuckyou

Microsoft Arc Touch Mouse


This thing. Travel version of the Arc Mouse. Replaces middle button/wheel with solid-state slide control that includes scroll, page up/down, and middle-click.

Except I lied there. It doesn’t actually support middle click. In his infinite wisdom, designer Young Kim made a click at the top of the strip, where your finger naturally falls, send a page-up keystroke. Clicking at the bottom of the strip sends a page-down.

Clicking the middle of the strip does nothing at all. You double-click the middle of the strip to generate a single middle-click. Middle-click-and-hold activates the annoying drag-scroll mode that I’ve never seen anyone use deliberately. Usually they end up trying to figure out why their mouse stopped working normally.

And why do I know the designer’s name? Because the only two things on the product support web site are an interview with him and a “lifestyle video”.

And why did I buy one? Because the last several MS mice I’ve bought had poorly-engineered scrollwheels that simply stopped working after a while, and I thought the solid-state version might be a step up. It looked nice at the company store, and didn’t suffer from the same heavy-spring problem that the right mouse button on the standard Arc has.

So, if you’re one of those two-buttons-is-enough Windows people, and you don’t mind risking the loss of the little USB dongle (held in place on the completely-flat underside of the mouse by a strong magnet), it looks like an excellent lifestyle accessory, and a decent mouse.

I am intrigued by her ideas...


…and wish to subscribe to her newsletter.

Ootani Masae: buy my singles or else!

In addition to releasing indie singles as “Himawari”, Masae Ootani has been getting some decent theatre roles recently. This one looks like it makes good use of her distinctive style. Honestly, except for the sword, it’s like she just walked out of her apartment. Maybe she leaves it at home; Tokyo cops are so sensitive about that sort of thing.

“Need a clue, take a clue,
 got a clue, leave a clue”