Tools

The Guardian of my World


A while back, I mentioned that I was tinkering with jQuery for updating my pop-up furigana. This dovetails nicely with my attempts to improve my Japanese reading skills, which currently involve working my way through Breaking into Japanese Literature and ボクのセカイをまもるヒト.

The first one is a parallel text with all vocabulary translated on the same page. I wish the author had formatted it a bit differently, and my teacher isn’t pleased with some of the translation, but it’s a useful learning tool, and there’s a free companion audiobook on the web site.

The second is the first in a new light novel series from Nagaru Tanigawa, also responsible for The Melancholy of Haruhi Suzumiya, and it includes furigana for almost all of the kanji. My goal is to read it, not translate, but I have to look up an awful lot of vocabulary, and there’s not enough room on the page to annotate.

So I’m typing it in, and using a Perl script to add my shiny new pop-up furigana.

(and, yes, I’m deliberately over-annotating; I don’t actually need many of those annotations, but someone else might, and it’s not that much work)

[Update: I should mention that I’m using Jim Breen’s translation server to speed up the glossing process. The parser gets lost occasionally, but it’s still very helpful, often finding idiomatic phrases that cover several words.]

Oh, here’s the cover, courtesy of Amazon:


Truth in advertising


Spotted this at Borders today. I like products that match their descriptions…

Lighted Magnifier

The name is effective, though. When I pointed it out to Jeff so he could laugh at it too, the woman in line behind him asked to see it, and ended up buying one.

It’s probably nice, but I think the Zelco Lumifier is better for carrying around. It’s my furigana tool.

Pygments


This seems like a nice tool for syntax-coloring code. I rarely feel the need for this feature myself, but it’s nice when it works, and this one works a lot better than BBEdit’s, although it shares a less-extreme version of the coloring bug I found when I was testing TextWrangler.

I don’t have nice things to say about the download/install process and documentation, though; it looks like the Python community is trying to come up with something similar to CPAN, but it doesn’t seem to be ready for general release yet.

TextWrangler


I want a better text editor. What I really, really want, I think, is GNU Emacs circa 1990, with Unicode support and a fairly basic Cocoa UI. What I’ve got now is the heavily-crufted modern GNU Emacs supplied with Mac OS X, running in Terminal.app, plus TextEdit.app when I need to type kanji into a plain-text file.

So I’ve been trying out TextWrangler recently; its virtues include being free and supporting a reasonable subset of Emacs key-bindings. Unfortunately, the default configuration is J-hostile, a number of settings can’t be changed for the current document (only for future opens), and its many configuration options are “less than logically sorted”.

What don’t I like?

First, the “Documents Drawer” is a really stupid idea, and turning it off involves several checkboxes in different places. What’s it like? Tabbed browsing with invisible tabs; it’s possible to have half a dozen documents open in the same window, with no visual indication that closing that window will close them all, and the default “close” command does in fact close the window rather than a single document within it.

Next, I find the concept of a text editor that needs a “show invisibles” option nearly as repulsive as a “show invisibles” option that doesn’t actually show all of the invisible characters. Specifically, if you select the default Unicode encoding, a byte-order mark (BOM) character is silently inserted at the beginning of your file. “Show invisibles” won’t tell you; I had to use /usr/bin/od to figure out why my furiganizer was suddenly off by one character.

Configuring it to use the same flavor of Unicode as TextEdit and other standard Mac apps is easy once you find it in the preferences, but fixing damaged text files is a bit more work. TextWrangler won’t show you this invisible BOM character, and /usr/bin/file doesn’t differentiate between Unicode flavors. I’m glad I caught it early, before I had dozens of allegedly-text files with embedded 文字化け. The fix is to do a “save as…”, click the Options button in the dialog box, and select the correct encoding.
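The od check can also be scripted. Below is a minimal sketch in Python, not something TextWrangler or the furiganizer does: the helper names detect_bom and strip_bom are hypothetical, and because the Unicode flavor involved isn’t stated above, it simply tests for the BOMs of the common ones.

```python
import codecs

# Known byte-order marks. The UTF-32 entries come first because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return (encoding, bom_length) if data starts with a BOM, else (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def strip_bom(data):
    """Return data with any leading BOM removed; everything else is untouched."""
    _, length = detect_bom(data)
    return data[length:]
```

Both helpers work on raw bytes, so read the file with open(path, "rb") before calling them; opening it in text mode would decode the BOM into an ordinary (and still invisible) character.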

Basically, over the course of several days, I discovered that a substantial percentage of the default configuration settings either violated the principle of least surprise or just annoyed the living fuck out of me. I think I’ve got it into a “mostly harmless” state now, but the price was my goodwill; where I used to be lukewarm about the possibility of buying their higher-end editor, BBEdit, now I’m quite cool: what other unpleasant surprises have they got up their sleeves?

By contrast, I’m quite fond of their newest product, Yojimbo, a mostly-free-form information-hoarding utility. It was well worth the price, even with its current quirks and limitations.

Speaking of quirks, my TextWrangler explorations yielded a fun one. One of its many features, shared with BBEdit, is a flexible syntax-coloring scheme for programming languages. Many languages are supported by external modules, but Perl is built in, and their support for it is quite mature.

Unfortunately for anyone writing an external parser, Perl’s syntax evolved over time, and was subjected to some peculiar influences. I admit to doing my part in this, as one of the first people to realize that the arguments to the grep() function were passed by reference, and that this was really cool and deserved to be blessed. I think I was also the first to try modifying $a and $b in a sort function, which was stupid, but made sense at the time. By far the worst, however, from the point of view of clarity, was Perl poetry. All those pesky quotes around string literals were distracting, you see, so they were made optional.

This is still the case, and while religious use of use strict; will protect you from most of them, there are places where unquoted string literals are completely unambiguous, and darn convenient as well. Specifically, they’re safe in two places: on the left side of the syntactic sugar “=>” in a list [ex: (foo => "bar")], and as a hash key surrounded by braces [ex: $x{foo}].

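TextWrangler and BBEdit are blissfully unaware of these “bareword” string literals, and make no attempt to syntax-color them. I think that’s a reasonable behavior, whether deliberate or accidental, but it has one unpleasant side-effect: interpreting barewords as operators. Perl’s quote-like operators (y, s, m, q, and friends) accept almost any character as a delimiter, so a hash key named y followed by “=>” can look like the start of a transliteration.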

Here’s the stripped-down example I sent them, hand-colored to match TextWrangler’s incorrect parsing:

#!/usr/bin/perl

use strict;

my %foo;
$foo{a} = 1;
$foo{x} = 0;

my %bar = (y=>1,z=>1,x=>1);

$foo{y} = f1() + f2() + f3();

sub f1 {return 0}
sub f2 {return 1}

sub f3 {return 2}

Automating PDF cleanup with Acrobat and AppleScript


As I mentioned earlier, I’m generating lots of PDF files that don’t work in Preview.app, and are also a tad on the large side. Resolving this problem requires the use of Adobe Acrobat and Acrobat Distiller. Automating this solution requires AppleScript. AppleScript is evil.

Just in case anyone else wants to do something like this from the command line, here’s what I ended up with, which is run as “osascript pdfcleaner.scpt myfile.pdf”:

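-- argv holds the command-line arguments that osascript passes in
on run argv
	-- Build a file reference from the shell's current directory plus the filename argument
	set input to POSIX file ((system attribute "PWD") & "/" & (item 1 of argv))
	-- Same path with ".pdf" swapped for ".ps" (every occurrence, not just the suffix)
	set output to replace_chars(input as string, ".pdf", ".ps")
	
	-- Open the PDF in Acrobat, save it as PostScript, and close without saving
	tell application "Adobe Acrobat 7.0 Standard"
		activate
		open alias input
		save the first document to file output using PostScript Conversion
		close all docs saving no
	end tell
	
	-- Run the PostScript file back through Distiller
	tell application "Acrobat Distiller 7.0"
		Distill sourcePath POSIX path of output
	end tell
	
	-- Wipe the file type and creator code so the PDF opens in Preview.app by default
	set nullCh to ASCII character 0
	set nullFourCharCode to nullCh & nullCh & nullCh & nullCh
	tell application "Finder"
		set file type of input to nullFourCharCode
		set creator type of input to nullFourCharCode
	end tell
	
	-- Bring Terminal back to the front when everything is done
	tell application "Terminal"
		activate
	end tell
end run

-- String replace: split the text on search_string, then rejoin it with replacement_string
on replace_chars(this_text, search_string, replacement_string)
	set AppleScript's text item delimiters to the search_string
	set the item_list to every text item of this_text
	set AppleScript's text item delimiters to the replacement_string
	set this_text to the item_list as string
	-- Reset the delimiters so later string operations behave normally
	set AppleScript's text item delimiters to ""
	return this_text
end replace_chars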

[I wiped out the file type and creator code to make sure that the resulting PDFs opened by default with Preview.app, not Acrobat; I swiped that code from Daring Fireball. The string-replace function came from Apple’s AppleScript sample site.]

"What do you do with a B6 notebook?"


(note: for some reason, my brain keeps trying to replace the last two words in the subject with “drunken sailor”; can’t imagine why)

Kyokuto makes some very nice notebooks. Sturdy covers in leather or plastic, convenient size, and nicely formatted refill pages. I found them at MaiDo Stationery, but Kinokuniya carries some of them as well. I like the B6 size best for portability; B5 is more of an office/classroom size, and A5 just seems to be both too big and too small. B6 is also the size that Kodansha publishes all their Japanese reference books in, including my kanji dictionary, which is a nice bonus.

[This is, by the way, the Japanese B6 size rather than the rarely-used ISO B-series. When Japan adopted the ISO paper standard, the B-series looked just a wee bit too small, so they redefined it to have 50% larger area than the corresponding A-series size. Wikipedia has the gory details.]
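The arithmetic is easy to check. Here is a minimal sketch in Python that starts from the published size-0 sheets and halves them repeatedly; the function names paper_size and area_ratio are hypothetical, and the round-down-on-halving rule is the usual convention for these series rather than anything stated above.

```python
# Size-0 sheets in millimetres for each series.
BASE_MM = {
    "A": (841, 1189),
    "ISO B": (1000, 1414),
    "JIS B": (1030, 1456),
}


def paper_size(series, n):
    """Return (width, height) in mm for size n of a series, by repeated halving."""
    width, height = BASE_MM[series]
    for _ in range(n):
        # Cut the long side in half, rounding down; the old short side
        # becomes the new long side.
        width, height = height // 2, width
    return width, height


def area_ratio(series, n):
    """Area of size n in the given series relative to the A size with the same number."""
    bw, bh = paper_size(series, n)
    aw, ah = paper_size("A", n)
    return (bw * bh) / (aw * ah)


print(paper_size("JIS B", 6))  # (128, 182)
print(paper_size("ISO B", 6))  # (125, 176)
print(paper_size("A", 6))      # (105, 148)
```

With these numbers, area_ratio("JIS B", 6) comes out just under 1.5, while area_ratio("ISO B", 6) is roughly 1.42, which is the “wee bit too small” in question.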

I really like the layout of Kyokuto’s refill paper. So much so, in fact, that I used PDF::API2::Lite to clone it. See? The script is a little rough at the moment, mostly because it also does 5mm grid paper, 20x20 tategaki report paper, and B8/3 flashcards, and I’m currently adding kanji practice grids with the characters printed in gray in my Kyoukasho-tai font. I’ll post it later after it’s cleaned up.

Why, yes, I was stuck in the office today watching a server upgrade run. However did you guess?

On a related note, am I the only person in the world who thinks that it’s silly to spend $25+ on one of those gaudy throwaway “journals” that are pretty much the only thing you can find in book and stationery stores these days? Leather/wood/fancy cover, magnet/strap/sticks to hold it shut, handmade/decorated (possibly even scented) papers, etc, etc. No doubt the folks who buy these things also carry a fountain pen with which to engrave their profound thoughts upon the page.

Or just to help them impress other posers.

Customizing for Usability, Bad Haiku Edition


I’m doing 45 minutes of cardio (most) every day on my LifeFitness 5500 elliptical cross-trainer. Doctor’s orders. I like working out on this machine, and it’s certainly good for me, but I’ve always had a problem occupying my mind. In the past, I’ve simply listened to music on my iPod, generally a PopTarts mix (or, more recently, JPopTarts). Studying kanji and vocabulary for my Japanese class would be an ideal use of this time, but I never ordered the optional magazine stand, and it doesn’t look like they make it any more.

So, I stopped at an office supply store and bought the only non-ridiculous copy-holder they sold. Just setting it on top of the cross-trainer worked fairly well, but hid the display. I really needed it to sit above the display section, but there was no obvious way to accomplish this feat. And then, a moment of clarity:

How to attach this...
What mounting system will work?
Ah! Some gaffer tape!

Short Review: Nisus Writer Express


If your (Mac-only) word-processing needs fit within Writer Express’s feature list, the generally sensible UI will make it a superior alternative to Word. Within its limitations, it’s an excellent, usable program.

However, if you need table support that’s better than an ancient version of Netscape, real Word interoperability, or precision layout tools, look elsewhere. For now, at least; they’re working hard to improve the product.

Note to people with fond memories of the Mac OS Classic Nisus Writer: Express implements a subset of the old features, along with a bunch of new ones.

“Need a clue, take a clue,
 got a clue, leave a clue”