Tuesday, March 1 2011

How did I miss this?

Donna Barr is putting both Stinz and The Desert Peach online.

Stinz is still in issue 1, before the war, but the Peach is all the way up to issue 21.

Lots of good stuff, but watching The Desert Fox hang ten is still one of my favorite bits.

Friday, March 4 2011

Well, at least it’s got a catchy title

Can’t go wrong with a title like “Regarding Ducks and Universes”, even when a quick inspection reveals that it’s a first novel published through Amazon’s vaguely-described Encore program.

I’m not recommending it, mind you, and I’m not even using my affiliate code in that link. I just found it interesting that Amazon is aggressively promoting an SF title by a complete unknown, as opposed to the usual “Kindle vanity press” or POD semi-publishing approaches.

Tuesday, March 8 2011

Notes on finishing a novel

A novel in Japanese, that is, converted into a custom “student edition” at precisely my reading level, as described previously.

  1. Speed and comprehension are good; once I resolved the worst typos, parsing errors, and bugs in my scripts, I was able to read at a comfortable pace with only occasional confusion. Words that didn’t get looked up correctly are generally isolated and easy to work out from context, and most of the cases where I had to stop and read a sentence several times turned out to be odd because the thing they were describing was odd (such as what the guard does before allowing the original Kino to enter the city in Natural Rights). Of course, it helps to have a general knowledge of the material.
  2. Coliseum was changed significantly for the animated version of Kino’s Journey; the original story leaves most of the opponents shallow and one-dimensional, and spends way too much time on the mechanical details of Kino’s surprise (both the preparation the night before, and the detailed description of the physical impact and aftermath). Mother’s Love, on the other hand, is a pretty straight adaptation.
  3. Casual speech and dialect don’t cause as many problems as you might expect. MeCab handles a lot of the common forms, and recovers well from the ones it has to punt on. They didn’t confuse me too often, either. After a while. :-)
  4. One thing that MeCab sometimes gets wrong is when a writer uses pre-masu form instead of te-form when listing a series of actions. I don’t have a good example at the moment, but I ran into several where it punted and looked for a noun.
  5. The groups that scan, OCR, and proofread novels tend to miss some simple errors where the software guessed the wrong kanji. A good example is writing 兵士 as 兵土 or 兵上. Light novels generally aren’t that complicated, and if a word looks rare or out of place, it may well be an OCR error.
  6. The IPA dictionary used by MeCab has some quirks that make it sub-optimal for use with modern fiction. Reading 空く as あく, 他 as た, 一寸 as いっすん, 間 as ま, 縁 as えん, and 身体 as しんたい are all correct sometimes, but not in some common contexts where their Ipadic priority causes MeCab to guess wrong. Worse, it has a number of relatively high-priority entries that are not in any dictionary I’ve found: 台詞 as だいし, 胡坐 as こざ, 面す and 脱す as verbs that are more common than 面する and 脱する, etc. It also has no entries for みぞれ, 呆ける, 粘度, 街路樹, and a bunch of others. Oddest of all, there are occasions where it reads 達 as いたる; this is a valid name reading, but name+達 is far more likely to be たち than いたる. I suspect some quirk of how it calculates the appropriate left/right contexts when evaluating alternatives, an aspect of the dictionary files that I definitely don’t understand.
  7. I need to make better use of the original furigana when evaluating MeCab output. I’m preserving it, but not using it to automatically detect some of the above errors. Mostly because I don’t want the scripts to become too interactive. Perhaps just flagging the major differences is sufficient.
  8. On to book 2!
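
A minimal sketch of the flagging idea in item 7, assuming the original furigana has already been collected into a dictionary and the analyzer output is available as (surface, reading) pairs. The function names and data layout here are hypothetical, not taken from the actual scripts; the only thing borrowed from MeCab is that Ipadic readings come back in katakana, so both sides are normalized to hiragana before comparing.

```python
def kata_to_hira(text):
    """Convert katakana to hiragana so two readings can be compared."""
    return "".join(
        chr(ord(ch) - 0x60) if "ァ" <= ch <= "ヶ" else ch
        for ch in text
    )


def flag_reading_mismatches(furigana, parsed):
    """Return (surface, original, guessed) for words whose readings disagree.

    furigana: dict mapping a surface form to the reading printed in the book
              (hypothetical layout).
    parsed:   list of (surface, reading) pairs from the analyzer.
    """
    flagged = []
    for surface, guessed in parsed:
        original = furigana.get(surface)
        if original is None:
            continue  # the book gave no furigana for this word
        if kata_to_hira(original) != kata_to_hira(guessed):
            flagged.append((surface, original, kata_to_hira(guessed)))
    return flagged


if __name__ == "__main__":
    # Sample data built from the misreadings listed in item 6.
    book = {"身体": "からだ", "台詞": "せりふ"}
    analyzer = [("身体", "シンタイ"), ("台詞", "ダイシ"), ("兵士", "ヘイシ")]
    for row in flag_reading_mismatches(book, analyzer):
        print(row)
```

This only reports differences and never asks for input, which keeps the scripts non-interactive; deciding which reading wins is left for a later pass.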

Wednesday, March 9 2011

Competitive advantage

Merrill reflects on Ila’s qualifications….

(Continued on Page 3741)

Dear Wisconsin Democrats,

This is not “what democracy looks like”, this is what a temper tantrum looks like. If you wanted democracy, you should have spent the last three weeks hounding your senators to stop hiding out in hotels and go back to their jobs.

Friday, March 11 2011

Ouch! Massive quake and tsunami hit Japan

Magnitude 8.9 off the coast near Sendai, with many significant aftershocks. USGS reports that there was a 7.2 magnitude quake in the same area two days ago, which had three 6+ magnitude aftershocks.

[Update: Brickmuppet has some disturbing details; coastal trains just “missing”, a city of 77,000 wiped off the map, etc. So far the best news I’ve seen is that the nuclear power plant that didn’t have enough coolant for a safe shutdown has been resupplied by air by the US Air Force (no, Hillary was talking out of her ass again).]

[Update: The American Red Cross doesn’t have a targeted donation page up yet for this disaster, but as reported by Reuters and elsewhere, they’ve set up an instant text-message donation system, and of course their standard international donation fund will be used to help out in Japan and elsewhere. I don’t see a way to contribute directly to the Japanese Red Cross on their site, but I’m sure we’ll find something when we arrive in Kyoto in two weeks.]

[Update: Wikipedia, Google Crisis Response pages]

[Update: Amazon is processing Red Cross payments through a prominently-displayed button on their home page. Amazon Japan has a letter on their home page redirecting to the Japan Red Cross donation site, which is currently a bit flaky.]

“Oh, joy, Tuf and Tuf again. I can hardly wait.”

George R. R. Martin’s Tuf Voyaging remains sadly out of print, but a small quantity of a relatively recent small-press edition is available directly from the author, autographed.

I made sure to place my order before mentioning this on my blog, just in case. My two paperback copies of the book are both starting to lose pages, and it’s an old favorite.

“I feel obliged to point out that a rather large carnivorous dinosaur has appeared in the corridor behind you, and is presently attempting to sneak up on us. He is not doing a very good job of it.”
– Haviland Tuf, Ecological Engineer

Saturday, March 12 2011

Revisiting Louie

Nearly three years ago, I had my first real success at reading Japanese prose written for a native audience. Getting through 30 pages of the first Rune Soldier Louie novel was a big accomplishment, given that I had to look up more than 600 new vocabulary words by painstakingly writing the kanji on my DS Lite or looking them up in printed dictionaries. It took nearly a month, an hour or two at a time.

That was before the demise of my group reading class, and my Japanese hasn’t improved very much since then. I’ve been treading water while waiting for Ooma to grow out of the startup lifestyle, and, yeah, that ain’t happened yet. My new scripts made it possible to read a complete novel in a reasonable time, but while the Rune Soldier novels have been scanned in, no one has gotten around to OCRing them. So I’m doing it.

  1. A ~1200x1800 PNG is adequate for Japanese OCR with ABBYY FineReader Pro (Windows only; the shiny new Mac App Store version does not include Japanese), but not great. It flags almost all of its possible errors, but there are maybe a dozen kanji per page that have to be checked, and the low resolution results in a number of small-kana errors and random guesswork.
  2. JPEG just sucks for OCR; I really wouldn’t want to proof a series that was only available as JPEGs.
  3. The scans for some series that haven’t been OCRd are only ~800x1200; even as PNG, those can’t be fun to OCR. Time to build a DIY Bookscanner!
  4. My scripts currently don’t handle oddball furigana well; in Rune Soldier, a number of ordinary words are given phonetically-written English readings, some quite long, and they create layout problems in pLaTeX.
  5. I need to figure out how to tell pLaTeX to break lines more aggressively; the small page size and tight margins of the Kindle mean that a sloppy line break can leave an entire character offscreen; rare, but annoying.

That said, I successfully OCRd and proofed those same thirty pages that I read three years ago, ran them through my scripts, and read the story. It took about two hours to prep, and another two hours to read. I found some more errors that need correcting, but the first pass was perfectly readable.
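
For the proofing pass, one cheap check is to look for words that are not in a known-word list but would be if a single look-alike kanji were swapped, as with 兵士 turning up as 兵土 or 兵上. The sketch below is an illustration under stated assumptions: the confusion table holds only that one example and would need to be extended by hand, and the word list and function name are hypothetical.

```python
# Hypothetical confusion table: correct kanji -> look-alikes OCR may emit.
# Only the 士/土/上 case comes from the notes above; extend as errors turn up.
LOOKALIKES = {"士": "土上"}


def suspect_words(words, known_words, lookalikes=LOOKALIKES):
    """Return (word, suggestion) pairs for likely OCR substitutions.

    A word is suspect when it is not in known_words, but replacing one
    character with the kanji it resembles yields a word that is.
    """
    # Invert the table: wrong character -> list of possible right ones.
    reverse = {}
    for right, wrongs in lookalikes.items():
        for wrong in wrongs:
            reverse.setdefault(wrong, []).append(right)

    suspects = []
    for word in words:
        if word in known_words:
            continue
        for i, ch in enumerate(word):
            for right in reverse.get(ch, []):
                candidate = word[:i] + right + word[i + 1:]
                if candidate in known_words:
                    suspects.append((word, candidate))
    return suspects


if __name__ == "__main__":
    for pair in suspect_words(["兵土", "兵上", "兵士"], {"兵士"}):
        print(pair)
```

Because a word is only flagged when the swap produces a known word, legitimate uses of 土 and 上 in words already on the list are left alone.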

I’ve also formatted and re-read Nishimura’s Ame no Naka ni Shinu, and the Kino stories Kioku no Kuni and Watashi no Kuni. I’m going to hold off on OCRing the rest of Rune Soldier 1 for a while, though, and focus on reading what I’ve got, which includes the second Kino novel and Tsutsui’s well-known Toki o Kakeru Shoujo. Oh, and I just remembered that copy of Kanjousen Pete typed in; that one’s already prepped for formatting.

Tuesday, March 15 2011

Disaster porn

We hates it.

Worse than we hates the mindless anti-nuke activists going “see? see?” and the morons from Left, Right, Center, and Alpha Centauri grinding their favorite axe and babbling about why Japan “deserved” this. Not that I plan to forget who signed their names to hatred that would shame a paid union agitator, I’m just busy reviling the news media hysteria peddlers at the moment.

Thursday, March 17 2011

Dear Apple,

Call me when I can do this on an iPad…

(Continued on Page 3747)

Monday, March 21 2011

“Run for the foothills!”

I honestly don’t know what to think about the Foothill College email newsletter leading off with a reassurance from the Santa Clara County Health Officer that there is currently no health threat from nuclear fallout here.

It is of course so artlessly phrased as to imply that Japan is now full of radioactive mutants wading knee-deep in the stuff.

[Update: Geiger counters have sold out in Paris. No, seriously.]

Saturday, March 26 2011

Dear Amazon,

Yeah, I got nothin’.

first person eater

Tuesday, March 29 2011

Thought for the day…

Reading Gruber’s opinions on Android is like reading a vegan review of roast beef. It’s clear that the mere thought of eating it makes him want to puke.

As far as I’m concerned, refurbished 1st-gen iPads are still about $200 over the price I’m willing to pay for such limited-by-design functionality, and I’m waiting to see if the Xoom and other Android tablets do better before I spend any money there. I haven’t bought into either ecosystem for a phone, either; my aging Blackberry handles work email effectively, and honestly does a better job as a phone.

Thursday, March 31 2011

Vacation substitute

The trip to Japan has been rebooked for late November, so no Kyoto cherry blossoms for us this year, but I know precisely how lovely Kansai is in Autumn, so we will certainly not be disappointed.

Unfortunately, this leaves me burned out and cranky, with no real alternative recovery plan. I have the usual three-free-nights offers in Vegas, but I don’t want casinos and crowds. California is finally warming up and drying out, so I could give my cameras some exercise at Point Lobos and other places, but there’s an air of been-there-done-that to all the nearby sightseeing opportunities, and they’re basically solo activities, where the Japan trip was built around sharing the experience with my sister.

Meanwhile, my 2002 Lexus had crossed the 280,000 mile mark, and despite its excellent health and promise of a long remaining lifetime, faced increasingly expensive service trips.

So I replaced it.

(Continued on Page 3754)