Thoroughly Random Blogging

More than usual, I mean. I’ve been playing with the static site generator Hugo as a way to move this blog and its comments out of Movable Type.

After clearing the initial hurdle of incomplete and inconsistent open-source documentation (pro tip: if a project starts numbering versions from 0.1 instead of 1.0, it’s safe to assume that there’s no tech writer on the team), the next step is adding a theme to render your site. There’s no default theme, and there are half a dozen different recommended ones of varying complexity and compatibility. Short version: I’m not sure Hugo currently has layout functionality equivalent to Movable Type 2.x from 2003, much less any of the modern tools. It might; it’s just that hard to find out.

There’s some support for basic pagination, something that’s always been missing here (and which is partially responsible for the long delay when adding comments), but the built-in paginator includes a link for every page, which is pretty painful when you have 200+ pages. If I get the time, I’ll have to dust off my Go and send them a patch to make it behave sensibly with large numbers.

Rendering all ~3,800 entries (counting quotes and sidebar microblogs) and ~3,500 comments takes only about 12 seconds on my laptop, but that’s still too long for iterative testing, and the OS open-file limit makes it impossible to test with the live-rebuild feature of the built-in web server.

So I wrote a quick Bash script to retrieve N random articles from Wikipedia and format them the way Hugo expects, as Markdown with TOML metadata. Why Bash? Because the official Wikipedia API for efficiently retrieving articles and their metadata using generators and continues is either broken or incomprehensible to me, since I spent two hours at it and got a never-ending list of complete and partial articles. So I just looped over the “” URL and piped the output through Pandoc. Rather than pulling in the real metadata, I just generate dates and categories in Bash. Now I can quickly generate a small site with multiple sections and simple categorization, and it’s trivial to add more features like series, tags, authors, etc. [in fact, I did!]

(relevant only to Hugo users after the jump…)

There’s no direct support for comments, but I found a rather amusing way to render them. I created a content section named ‘_comment’, whose metadata contains a taxonomy tag named ‘comments’ listing the ID of the associated entry. Creating zero-length files named layouts/_comment/single.html and layouts/section/_comment.html completely suppresses page-generation for this section, so that it can only be reached through the taxonomy. The layouts/taxonomy/comment.html file overrides the default descending-date sort, so the comments appear in order, and the main blog entries use the ‘id’ tag from their metadata to generate a link to the taxonomy page: /comments/$id.

You get warnings about generating zero-length files, but the output is clean.

Now, actually adding new comments is something I haven’t worked on yet; it’s a static site generator, after all, and dynamic content has to get added with JavaScript or CGI. Isso looks like the least painful option, although it has a lot more complexity than I really want. Using a small, secure CGI script that carefully sanitized inputs and saved the comments into my ‘_comment’ section as unpublished drafts would work, and allow simple moderation and maintenance. Not a high priority right now; if I switch over, I can just leave comments off for a week or so.

[side note: my JavaScript URL rewriter has held up remarkably well for stopping blog-spamming scripts; people who POST to the URL that’s actually in the HTML files not only fail, but get their IP address instantly blocked by my firewall. To cut down on false positives, the Submit button is also disabled unless JavaScript is active. :-)]

The big chore right now is getting the templates and CSS styling working so I can rebuild something similar to the current blog’s design, but mobile-compatible. Nothing I’ve tried in the Hugo Themes Gallery is a good match, partially because they target different versions of the rapidly-changing spec, but mostly because their assumptions don’t map well to my 14-year-old blog, in either features or scale.