Monday, October 15, 2007

Sony Reader 505

So I bought the second-generation Sony Reader. Thinner, faster, crisper screen, cleaned-up UI, USB2 mass storage for easy import, and some other improvements over the previous one. It still has serious limitations, and in a year or two it will be outclassed at half the price, but I actually have a real use for a book-sized e-ink reader right now: I’m finally going to Japan, and we’ll be playing tourist.

My plan is to dump any and all interesting information onto the Reader, and not have to keep track of travel books, maps, etc. It has native support for TXT, PDF, PNG, and JPG, and there are free tools for converting and resizing many other formats.

Letter and A4-sized PDFs are generally hard to read, but I have lots of experience creating my own custom-sized stuff with PDF::API2::Lite, so that’s no trouble at all. The PDF viewer has no zoom, but the picture viewer does, so I’ll be dusting off my Ghostscript-based pdf2png script for maps and other one-page documents that need to be zoomed.
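For anyone who wants to try the same trick, here is a minimal sketch of a PDF-to-PNG step in Python. It is not the pdf2png script mentioned above; the function names, the 150 dpi default, and the choice of the grayscale `pnggray` device are assumptions made for illustration. It only relies on standard Ghostscript command-line switches.

```python
import subprocess


def gs_png_command(pdf_path, png_path, dpi=150, page=1):
    """Build a Ghostscript command that renders one PDF page to a PNG.

    The function name and defaults are illustrative assumptions,
    not taken from any existing script.
    """
    return [
        "gs",
        "-dSAFER", "-dBATCH", "-dNOPAUSE",   # non-interactive run
        "-sDEVICE=pnggray",                  # 8-bit grayscale PNG output
        f"-r{dpi}",                          # render resolution in dpi
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        "-dTextAlphaBits=4",                 # anti-alias text
        "-dGraphicsAlphaBits=4",             # anti-alias line art
        f"-sOutputFile={png_path}",
        pdf_path,
    ]


def pdf_to_png(pdf_path, png_path, dpi=150, page=1):
    """Run Ghostscript; raises CalledProcessError if gs fails."""
    subprocess.run(gs_png_command(pdf_path, png_path, dpi, page), check=True)


# Example (requires Ghostscript on the PATH):
# pdf_to_png("tokyo-subway.pdf", "tokyo-subway.png", dpi=200)
```

Raising the dpi gives the picture viewer more pixels to zoom into, at the cost of a larger file.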

I’ll write up a detailed review soon, but so far there’s only one real annoyance: very limited kanji support. There’s none at all in the book menus, which didn’t surprise me, and silent failure in the PDF viewer, which did. Basically, any embedded font in a PDF file is limited to 512 characters; if a font has more than that, none of the characters in that font will appear in the document at all.

The English Wikipedia and similar sites tend to work fine, because a single document will only have a few words in Japanese. That’s good enough for the trip, but now that I’ve got the thing, I want to put some reference material on it. I have a script that pulls data from EDICT and KANJIDIC and generates a PDF kanji dictionary with useful vocabulary, but I can’t use it on the Reader.

…unless I embed multiple copies of the same font, and keep track of how many characters I’ve used from each one. This turns out to be trivial with PDF::API2::Lite, but it does significantly increase the size of the output file, and I can’t clean it up in Acrobat Distiller, because that application correctly collapses the duplicates down to one embedded font.
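The bookkeeping for that workaround can be sketched in a few lines. This is a Python illustration of the idea, not the PDF::API2::Lite code itself; the `FontPool` class and its method names are hypothetical, and it assumes the 512 limit counts distinct characters drawn from each embedded copy. Each new character is assigned to the current copy until that copy is full, and text is split into runs so each run can be drawn with the right copy.

```python
MAX_CHARS_PER_FONT = 512  # limit described above; treated here as an assumption


class FontPool:
    """Tracks which copy of a font each character is assigned to.

    Hypothetical helper for illustration; it does no PDF output itself.
    """

    def __init__(self, limit=MAX_CHARS_PER_FONT):
        self.limit = limit
        self.copies = []     # one set of used characters per font copy
        self.assigned = {}   # character -> index of the copy that holds it

    def copy_for(self, ch):
        """Return the index of the font copy to use for this character."""
        if ch in self.assigned:
            return self.assigned[ch]
        # Start a new copy when there is none yet or the latest one is full.
        if not self.copies or len(self.copies[-1]) >= self.limit:
            self.copies.append(set())
        idx = len(self.copies) - 1
        self.copies[idx].add(ch)
        self.assigned[ch] = idx
        return idx

    def runs(self, text):
        """Split text into (copy_index, substring) runs of consecutive
        characters that share a font copy."""
        out = []
        for ch in text:
            idx = self.copy_for(ch)
            if out and out[-1][0] == idx:
                out[-1] = (idx, out[-1][1] + ch)
            else:
                out.append((idx, ch))
        return out


pool = FontPool(limit=2)          # tiny limit just to show the splitting
print(pool.runs("abca"))          # [(0, 'ab'), (1, 'c'), (0, 'a')]
```

A PDF generator would then embed one font copy per entry in `pool.copies` and switch fonts at each run boundary. Assigning characters in order of first use is the simplest policy; grouping characters that tend to appear together would reduce the number of font switches.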

I haven’t checked to see if the native Librie format handles font-embedding properly. I’ll have to install the free Windows software at some point and give it a try.

[Update: I couldn’t persuade Distiller to leave the multiple copies of the font alone, because OpenType CID fonts apparently embed a unique ID in several places. FontForge was perfectly happy to convert it to a non-CID TrueType font, and then I only had to rename one string to create distinct fonts for embedding. My test PDF works fine on the Reader now.]