Web

Fun with Hugo themes


One of the challenges with Hugo is that, out of the box, it doesn’t do anything. Create a site, fill it with content, run the generator, and you get… nothing. You need to download or create a theme in order to actually render your content; there isn’t one built into the site-creator, although several volunteers are working on something (much the same way that usable documentation is largely a volunteer effort).

It is not immediately obvious that the theme gallery is sorted by update date, so the farther down the list you go, the less likely the themes are to work. There’s a top-level set of feature tags, but they’re applied by the theme authors, and don’t include useful things like “scales beyond 100 pages”.

As part of my ongoing MasterCook molesting, I decided to take the now-sane XML files and render them to Hugo’s mix of TOML and Markdown, generating a static cookbook site with sections and categories. Having done some experimentation in response to a forum post, I knew that a site with 56,842 pages would take several minutes to build, so I grabbed the simple, clean Zen theme and fired it off.

And waited. And waited. And watched the memory usage climb to over 40GB of compressed pages.

The Hugo developers pride themselves on rendering speed, but when I checked the disk, it was taking upwards of a second to render a single content page. Looking at one of them made it obvious why: the theme designer included every content page in the dropdown menus and sidebar. It had honestly never occurred to him that someone might have more than about 8 categories with about 20 pages each. In fairness, this is a port of a Drupal theme, and the original might have had the same problem.

After modifying the templates to only use the first 20 from each category, I got the site to render in about 10 minutes. The category menu looks horrible, because I split the recipes up alphabetically into chunks of about a thousand, and the theme only allocated enough space for about 2/3 of them, with the rest covering the title field. The actual recipe rendering is excellent, including the handling of sub-recipes and referenced recipes.

I could modify the Zen theme until it did everything right, or spend several hours rebuilding a small sample site with other themes until I found one that required less work, but once you’ve built one theme from scratch, it’s just faster and easier to do that than to try to use any of the pre-built themes. Their real value is as examples of “how do you do this in Hugo”, which you can’t generally find in the documentation.

There are also quite a few working code snippets in the forums (some provided by me; problem-solving is kinda my thing, if you haven’t guessed by now), but with so much of the code under active development, any forum example more than a few months old is likely to be wrong now.

It’ll be a while before I bring the cookbook back up, since this is definitely a copious-free-time project, and not only do I have to knock together a theme and set up search (most likely Xapian Omega again, since I’m fresh on it), but also molest the recipe data and impose some consistency on categorization, tagging, and ingredient naming. Currently it has 782 distinct categories, many of which differ by only a few characters, and about 2/3 of them should really be tags instead. All of these issues should really be fixed in the MX2 files, so that they can be cleanly imported back into MasterCook, but since that’s not XML, the scripting is a little more “interesting”.
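Finding categories that differ by only a few characters is easy enough to rough out. Here is a minimal sketch in Python using the standard library's difflib; the sample names and the 0.85 cutoff are made up for illustration, and this is not the script that will actually be run against the MX2 data:

```python
import difflib
from itertools import combinations

def near_duplicates(categories, cutoff=0.85):
    """Return (a, b, score) for category names that look like variants of each other."""
    # Compare case-insensitively and ignore stray whitespace.
    normalized = sorted({c.strip().lower() for c in categories})
    pairs = []
    for a, b in combinations(normalized, 2):
        ratio = difflib.SequenceMatcher(None, a, b).ratio()
        if ratio >= cutoff:
            pairs.append((a, b, round(ratio, 2)))
    return pairs

# Hypothetical sample; the real list has 782 entries.
sample = ["Desserts", "Dessert", "Main Dish", "Main Dishes", "Soups"]
for a, b, score in near_duplicates(sample):
    print(f"{a!r} ~ {b!r} ({score})")
```

Comparing every pair is quadratic, but a few hundred names is only a few hundred thousand comparisons. The output is a list of candidates to review by hand, not something to merge automatically.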

Tentatively, I’m going to start with my blog theme, since I’ve already tested it at scale (and learned that large taxonomies are a significant bottleneck). I can strip out a lot of the blog-specific stuff without much effort, and I’ve already done the work to switch over to dropdown menus for categorization, so the only real trick will be embedding any referenced recipes in a hidden DIV at the bottom of each page, and setting up a print-only stylesheet that hides the nav and exposes the embedded recipes. The references are already turned into links to the appropriate recipe’s page, thanks to the built-in relref shortcode.

Do you even Internet?


There was a time when I used to feel like I was cheating, somehow, getting paid to do things that were easy and obvious. But I kept running into people who Just Didn’t Think Right. I never developed the common “people who aren’t like me must be stupid” problem, thanks in part to dealing with a lot of secretaries who could do all sorts of things that I couldn’t, but even today, I sometimes get the urge to reach through the Internet, grab someone by the collar, and shout “but it’s right there!​”.

For instance, Scandalous Gaijin collects some quite pleasant cosplay photos, but mixed in with the cheesecake recently was a shot of a nicely old-fashioned street in Japan, with kimono-clad women in the foreground and a pagoda in the background, with the comment “Anyone knows the name of this street in kyoto? I need to check it out”. Now, if you don’t read kanji, you might not guess that the second half of the caption “もう一度 八坂の塔” is the name of the place, but when you look at the picture you can clearly read the names of two stores, “Happy Pie” and “Happy Bicycle”, and typing “happy bicycle kyoto” into Google Maps takes you to the exact spot the picture was taken from. (and if you didn’t know it was Kyoto, cut-and-pasting 八坂の塔 into Google will tell you that)

Google Image Search gets overlooked a lot, too; it would be nice if they’d sort the results chronologically, so you don’t have to search through the largest images by hand until you find something close to the original source, but generally it will at least give you some information, and often the full context it was originally posted in. The answer to “who is this goddess and where can I find more pictures of her” is generally pretty easy to find.

Even just plain Google searches seem to elude otherwise intelligent people. Just today, I’ve had two extremely intelligent, skilled co-workers email me detailed error messages and ask how to fix the problem. I paste the error into Google, and *poof*, the answer emerges. At least with these two I know they can figure it out, and they’re just outsourcing their problem-solving to me so they can get back to fixing other broken things, but a lot of times it’s from people who are honestly stumped by something that could be resolved with ten seconds of cut-and-paste.

Now, the people who email me screenshots of detailed error messages, they’re beyond help…

Pictures and spoilers and smartypants


When the world was young, and this “blogging” thing was new, I maintained my site by hand, typing new content into index.html as I thought of it. Then I spent a great deal of time customizing Movable Type to suit my needs, and used it for the next 14 years.

One of the common plugins was SmartyPants, which turned scruffy old typewriter quotes into pretty curved ones. As a long-time type nerd, of course I had to use it. The MT implementation was pretty good, and only rarely guessed wrong about open quotes. The one used by Hugo is, unfortunately, always wrong in a specific case that I use quite often: quotations that start with an ellipsis. For those, I’ve had to go through the archives and manually insert the Unicode zero-width space character (U+200B) after the opening quote.
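That archive fix could in principle be scripted rather than done by hand. Here is a rough sketch in Python, assuming the source still has straight typewriter quotes and that an opening quote is one found at the start of a line or after whitespace or an opening bracket. The function name and that heuristic are my own illustration, not anything Hugo provides:

```python
import re

ZWSP = "\u200b"  # Unicode zero-width space

# A straight double quote at the start of a line, or after whitespace or an
# opening bracket, that is immediately followed by an ellipsis (three dots
# or the single ellipsis character).
PATTERN = re.compile(r'(^|[\s(\[])"(?=\.\.\.|\u2026)', re.MULTILINE)

def protect_ellipsis_quotes(text):
    """Insert a zero-width space after opening quotes that precede an ellipsis."""
    return PATTERN.sub(lambda m: m.group(1) + '"' + ZWSP, text)

# repr() makes the otherwise invisible character show up in the output.
print(repr(protect_ellipsis_quotes('He said "...and then" and left.')))
```

Running it twice is harmless: once the space is in place, the quote is no longer directly followed by an ellipsis, so the pattern stops matching.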

I never used MT’s web form for posting content, because, like so many other people have discovered, it’s too easy to lose an hour of work with a single mis-click or fumble-finger. Ecto was a great tool until it just stopped working one day (long after it stopped being supported), with only one quirk: at random intervals it would lose track of the UTF-8 encoding, and post garbage instead of kanji. A refresh would always fix the problem, so it was just a minor annoyance.

When it stopped working, I switched to MarsEdit, which is an excellent tool, and if I could easily connect it to Hugo, I would. As it is, I’ve gone back to running Emacs in a terminal window, with Perl/Bash scripts and Makefiles wrapped around an assortment of command-line tools.

For images, I insist on supplying proper height and width attributes so that the browser can lay out the page properly while waiting for the download. Hugo can automatically insert those for pictures stored locally, but I upload them all to an S3 bucket with s3cmd, so I run them all through ImageMagick’s convert for cleanup and resizing, then Guetzli for JPEG conversion, and embed them with this shortcode:

{{ $link := (.Get "link" | default (.Get "href"))}}
{{ $me := . }}
<div align="center" style="padding:12pt">
  {{if $link}}
    <a href="{{$link}}">
  {{end}}
  <img
    {{ range (split "src width height class title alt" " ") }}
      {{ if $me.Get . }}
        {{. | safeHTMLAttr}}="{{$me.Get .}}"
      {{end}}
    {{end}}
  >
  {{if $link}}
    </a>
  {{end}}
</div>

None of the arguments are mandatory (even src, without which there’s not much point), but it will add any of the listed ones if you’ve supplied them, and allow you to add a link with either “link” or “href”. This can be embedded in the new spoiler shortcode I wrote yesterday (which relies on Bootstrap’s collapse.js):

{{ $id := substr (md5 .Inner) 0 16 }}
{{ $label := (.Get 0 | default "view/hide") }}
{{ $class := (index .Params 1 | default "") }}
<div class="collapse {{$class}}" id="collapse{{$id}}">{{ .Inner }}</div>
<p><a role="button" class="btn btn-default btn-sm"
  data-toggle="collapse" href="#collapse{{$id}}"
  aria-expanded="false" aria-controls="collapse{{$id}}">{{$label}}</a>
</p>

The results look like this, and yes, the picture behind the NSFW tag is NSFW:

{{< spoiler NSFW >}}
{{< blogpic 
  src="https://dotclue.s3.amazonaws.com/img/tumblr_o3wrl58ICr1rlk3g8o1_1280.jpg"
  width="560" height="420"
  class="img-rounded img-responsive"
>}}
{{< /spoiler >}}
...well, Not Safe For Waterfowl, anyway...

It took about 30 seconds to convert my Gelbooru mass-posting script to generate shortcodes instead of HTML, so my most-recent cheesecake post was done this way. Now that I have the NSFW shortcode, I’ll likely include some racier images in the next one…
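Generating a shortcode instead of HTML mostly comes down to string formatting. Here is a sketch in Python (not the actual mass-posting script, whose details aren't shown here). The function name is hypothetical, and the accepted parameter names are the ones the blogpic shortcode above looks for:

```python
# Parameters the blogpic shortcode reads, in the order they will be emitted.
ALLOWED = ("src", "width", "height", "class", "title", "alt", "link", "href")

def blogpic(**params):
    """Build a {{< blogpic >}} shortcode call from keyword arguments."""
    # "class" is a Python keyword, so accept it spelled class_ as well.
    if "class_" in params:
        params["class"] = params.pop("class_")
    unknown = set(params) - set(ALLOWED)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # Values are assumed not to contain double quotes.
    parts = [f'{name}="{params[name]}"' for name in ALLOWED if name in params]
    return "{{< blogpic " + " ".join(parts) + " >}}"

print(blogpic(src="https://example.com/pic.jpg", width=560, height=420,
              class_="img-rounded img-responsive"))
```

The example URL is a placeholder. Rejecting unknown parameters up front catches typos that the shortcode itself would silently ignore.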

At some point I’ll pull out all my scripts and customizations into a demo blog on GitHub, so that I have something to point to when someone asks how to do something that is either not directly supported in Hugo (like monthly archive pages), or is just poorly documented (“damn near everything”).

The New Perl Way: “You’re doing it wrong”


use CGI;

“I’m sorry, Dave, I can’t do that.”

cpanm CGI

“You really shouldn’t use that any more. It’s bad for you.”

perldoc CGI

“The rationale for this decision is that CGI.pm is no longer considered good practice for developing web applications, including quick prototyping and small web scripts. There are far better, cleaner, quicker, easier, safer, more scalable, more extensible, more modern alternatives available at this point in time. These will be documented with CGI::Alternatives.”

perldoc CGI::Alternatives

No documentation found for “CGI::Alternatives”.

cpanm CGI::Alternatives
perldoc CGI::Alternatives

“Let me build this strawman that doesn’t actually make good use of CGI.pm to show you how you can easily switch to one of half a dozen different frameworks that let you use half a dozen different templating systems launched with half a dozen different embedded web servers, and replace your self-contained 100-line CGI script with half a dozen files located in half a dozen directories. For more fun, my sample code gets mangled if you try to view it as a manpage, so you really should download the raw file from CPAN.”

cpanm --uninstall CGI::Alternatives
cpanm Dancer2
perldoc Dancer2
cpanm --uninstall Dancer2
cpanm Mojolicious
perldoc Mojolicious
perldoc Mojolicious::Lite

use Mojolicious::Lite;
plugin CGI => [ '/' => "trivialscript.cgi" ];
app->start;

use CGI;


Legacy


After Steven Den Beste died, some of the (many!) people who were concerned about the loss of his old web sites reached out to the family to try to recover the data from his server. I was pulled in because I was physically closest when it seemed like we might need someone to go to Portland to pick up the machine.

That wasn’t necessary, but since I was the one exchanging email with his brother, I was the one who ended up with a shiny little thumb drive containing the old Chizumatic site, and between that and the Wayback Machine, managed to synthesize a complete, functional website.

I packaged it all up, sent it to my not-so-secret allies, and then… nothing. This is not a criticism or complaint; everybody’s busy, and after that one energetic weekend, I hadn’t done anything about it, either.

But now I have a brand new virtual server at Amazon, where bandwidth is silly-cheap and disk space ain’t no big deal. And I’d already figured out the Nginx config to get the old server-side includes working.

So, this may not be the official permanent home of Steven’s old web sites, but it is a home, for a welcome houseguest.

(via)

Probationary Comment System: Isso


It felt lonely in here, so I got Isso working for comments. Easy to nuke-and-pave if I don’t like it, at least. The whole “Python virtualenv” experience was a real pain in the ass, though, since pip install repeatedly claimed to have installed all the dependencies, while pip list called bullshit on that.

I’ll probably have to put Monit on the server in case it crashes, but that can wait.

Update

It’s possible to have Isso dynamically update the comment count in the article metadata block, but I just spent about an hour failing to get it to work, between Isso’s and Hugo’s overlapping limitations.

On the Isso side, you can either show the comment form or add counts to a page. They’re conflicting JavaScript includes, according to the docs. I could write my own bit of jQuery to make an Ajax call to retrieve the count and insert it into the page, but I thought that would be more work.

Until I ran into Hugo’s variable-scoping. When you render content in a list context, you’re really fully rendering each page in its own context and then including the results. So, inside a template, variables like $.Kind and $.URL refer to the individual article’s context, as if you were currently writing out that one article to disk. And of the two completely different ways you can set variables, one of them is strictly block-scoped, and the other is strictly page-scoped. You can’t pass either down into a partial template.

(there’s a partial-with-arguments called a shortcode, but that’s a completely different beast, and I’m not sure it is either effective or efficient to replace all your partials with shortcodes) UPDATE: completely impossible, in fact; shortcodes don’t work in template files, and partial templates don’t work in content files. They’re completely different things with completely different behaviors. You have to construct a custom dictionary and pass it into a partial template, which is butt-ugly and error-prone.

So, yeah, no comment-count on the home page at the moment.

4/12 Update

I wrote my own bitty jQuery function to use Isso’s API directly and insert the comment count on page-load. It would be nice if the API returned “0” instead of 404 errors when there aren’t any comments, though.

Jacking up the license plates…


…and changing the car.

Welcome to the first non-trivial update to this blog since 2003. Things are still in flux, but I’m officially retiring the old co-lo WebEngine server in favor of Amazon EC2. After running continuously for fourteen years, its 500MHz Pentium III (with 256MB of RAM and a giant 80GB disk!) can take a well-deserved rest.

The blog is a complete replacement as well, going from Movable Type 2.64 to Hugo 0.19, with ‘responsive’ layout by Bootstrap 3.3.7. A few Perl scripts converted the export format over and cleaned it up. Let’s Encrypt allowed me to move everything to SSL, which breaks a few graphics, mostly really old YouTube embeds, but cleanup can be done incrementally as I trip over them.

Comments don’t work right now, because Hugo is a static site generator. I’ve worked out how I want to do it (no, not Disqus), but it might be a week or so before it’s in place. All the old comments are linked correctly, at least.

Do I recommend Hugo? TL;DR: Not today.

Getting out of the co-lo has been on my to-do list for years, but I never got around to it, for two basic reasons:

  1. I was hung up on the idea of upgrading to newer blogging software.

  2. I didn’t feel like running the email server any more, and didn’t like the hosting packages that were compatible with MT and other non-PHP blogging tools.

In the end, I went with G Suite (“Google Apps for Work”) for $5/month. Unlike the hundreds of vendor-specific email addresses I maintain at jgreely.com, I’ve only ever used one here, and all the other people who used to have accounts moved on during W’s first term.

Next up, working comments!

Update

Actually, next turned out to be getting the top-quote to update randomly. The old method was a cron job that used wget to log into the MT admin page and request an index rebuild, which, given the tiny little CPU, had gotten rather slow over the years, so it only ran every 15 minutes.

Since the site is now published by running hugo on my laptop and rsyncing the output, it’s not feasible or sensible to update the quotes by rebuilding the entire site. So I wrote a tiny Perl script that regexes the current quotes out of all the top-level pages for the site, shuffles them, and reinserts them into those pages. It takes about half a second.
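The shuffle itself is simple enough to sketch. This is a Python approximation rather than the Perl script described above, and the `<div class="topquote">` wrapper is a made-up stand-in for however the real pages mark the quote:

```python
import random
import re

# Hypothetical markup: each top-level page carries one quote in this wrapper.
QUOTE_RE = re.compile(r'(<div class="topquote">)(.*?)(</div>)', re.DOTALL)

def shuffle_quotes(pages, rng=random):
    """Given {path: html}, return a new dict with the quotes redistributed."""
    paths = [p for p, html in pages.items() if QUOTE_RE.search(html)]
    quotes = [QUOTE_RE.search(pages[p]).group(2) for p in paths]
    rng.shuffle(quotes)
    result = dict(pages)
    for path, quote in zip(paths, quotes):
        # A function replacement keeps any backslashes in the quote literal.
        result[path] = QUOTE_RE.sub(
            lambda m, q=quote: m.group(1) + q + m.group(3),
            pages[path], count=1)
    return result
```

Reading the pages from disk and writing them back is left out. Pages without a quote pass through untouched, and because the quotes are only permuted, none are lost or duplicated.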

Since there are ~350 pages, there will be decent variety even if I don’t post for a few days and regenerate the set. If I wanted to get fancy, I could parse the full quotes page and shuffle that into the indexes, guaranteeing a different quote on each page (as long as the number of quotes exceeds the number of pages, which means I can add about 800 blog entries before I need to add more quotes. :-)

Thoroughly Random Blogging


More than usual, I mean. I’ve been playing with the static site generator Hugo as a way to move this blog and its comments out of Movable Type.

After clearing the initial hurdle of incomplete and inconsistent Open Source documentation (pro tip: if a project starts numbering versions from 0.1 instead of 1.0, it’s safe to assume that there’s no tech writer on the team), the next step is adding a theme to render your site. There’s no default theme, and half a dozen different recommended ones of varying complexity and compatibility. Short version: I’m not sure Hugo currently has layout functionality equivalent to Movable Type 2.x from 2003, much less any of the modern tools; it might, it’s just that hard to find out.

There’s some support for basic pagination, something that’s always been missing here (and which is partially responsible for the long delay when adding comments), but the built-in paginator includes a link for every page, which is pretty painful when you have 200+ pages. If I get the time, I’ll have to dust off my Go and send them a patch to make it behave sensibly with large numbers.
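By "sensibly" I mean something like the usual windowed layout: first page, last page, and a few neighbors of the current one. Here is a sketch of that logic in Python, purely illustrative; it is neither Hugo's paginator code nor the patch, and the function name and `window` parameter are invented here:

```python
def page_links(current, total, window=2):
    """Return the page numbers to link, with None marking a skipped run."""
    if total < 1 or not 1 <= current <= total:
        raise ValueError("current must be between 1 and total")
    # Always link the first and last pages, plus a window around the current one.
    wanted = {1, total}
    wanted.update(range(max(1, current - window),
                        min(total, current + window) + 1))
    links = []
    previous = 0
    for page in sorted(wanted):
        if page - previous > 1:
            links.append(None)  # render this as an ellipsis
        links.append(page)
        previous = page
    return links

print(page_links(100, 200))
```

With 200 pages this yields at most nine entries, gap markers included, instead of 200 links.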

Rendering all ~3,800 entries (counting quotes and sidebar microblogs) and ~3,500 comments takes about 12 seconds on my laptop, but that’s still too long for iterative testing, and the OS open-file limit makes it impossible to test with the live-rebuild feature of the built-in web server.

So I wrote a quick Bash script to retrieve N random articles from Wikipedia and format them the way Hugo expects, as Markdown with TOML metadata. Why Bash? Because the official Wikipedia API for efficiently retrieving articles and their metadata using generators and continues is either broken or incomprehensible to me, since I spent two hours at it and got a never-ending list of complete and partial articles. So I just looped over the “https://en.wikipedia.org/wiki/Special:Random” URL and piped the output through Pandoc. Rather than pulling in the real metadata, I just generate dates and categories in Bash. Now I can quickly generate a small site with multiple sections and simple categorization, and it’s trivial to add more features like series, tags, authors, etc. [in fact, I did!]
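The metadata-faking half is the easy part. Here is a sketch of it in Python rather than the Bash actually used, with placeholder category names and an arbitrary date range. It wraps already-converted Markdown in the TOML front matter Hugo reads between `+++` lines:

```python
import random
from datetime import date, timedelta

CATEGORIES = ["alpha", "beta", "gamma", "delta"]  # placeholder names

def fake_post(title, body, rng):
    """Wrap Markdown body text in TOML front matter with made-up metadata."""
    # Pick a date somewhere in a seven-year span starting in 2010 (arbitrary).
    day = date(2010, 1, 1) + timedelta(days=rng.randrange(365 * 7))
    cats = rng.sample(CATEGORIES, k=rng.randint(1, 2))
    cat_list = ", ".join(f'"{c}"' for c in cats)
    # Escape backslashes and double quotes for a TOML basic string.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "+++\n"
        f'title = "{safe_title}"\n'
        f"date = {day.isoformat()}T00:00:00Z\n"
        f"categories = [{cat_list}]\n"
        "+++\n\n"
        f"{body}\n"
    )

print(fake_post("Example article", "Body text goes here.", random.Random(7)))
```

Passing in a seeded `random.Random` makes a test site reproducible from one run to the next.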

(relevant only to Hugo users after the jump…)


“Need a clue, take a clue,
 got a clue, leave a clue”