About two and a half years ago, I threw together a set of Perl scripts that converted Japanese novels into nicely-formatted custom student editions with additional furigana and per-page vocabulary lists. I said I’d release it, but the code was pretty raw, the setup required hacking at various packages in ways I only half-remembered, and the output had some quirks. It was good enough for me to read nearly two dozen novels with decent comprehension, but not good enough to share.
When I ran out of AsoIku novels to read, I decided it was time to start over. I set fire to my toolchain, kept only snippets of the old code, and made it work without hacking on anyone else’s packages. Along the way, I switched to a much better parsing dictionary, significantly improved lookup of phrases and expressions, and made the process Unicode-clean from start to finish, with no odd detours through S-JIS.
Still some work to do (including that funny little thing called “documentation”…), but it makes much better books than the old one, and there are only a few old terrors left in the code. So now I’m sharing it.
https://github.com/jgreely/yomitori