Welcome to the first non-trivial update to this blog since 2003. Things are still in flux, but I’m officially retiring the old co-lo WebEngine server in favor of Amazon EC2. After running continuously for fourteen years, its 500MHz Pentium III (with 256MB of RAM and a giant 80GB disk!) can take a well-deserved rest.
The blog is a complete replacement as well, going from MovableType 2.64 to Hugo 0.19, with ‘responsive’ layout by Bootstrap 3.3.7. A few Perl scripts converted the export format over and cleaned it up. Let’s Encrypt allowed me to move everything to SSL, which breaks a few graphics, mostly really old YouTube embeds, but cleanup can be done incrementally as I trip over them.
Comments don’t work right now, because Hugo is a static site generator. I’ve worked out how I want to do it (no, not Disqus), but it might be a week or so before it’s in place. All the old comments are linked correctly, at least.
Do I recommend Hugo? TL;DR: Not today.
Getting out of the co-lo has been on my to-do list for years, but I never got around to it, for two basic reasons:
I was hung up on the idea of upgrading to newer blogging software.
I didn’t feel like running the email server any more, and didn’t like the hosting packages that were compatible with MT and other non-PHP blogging tools.
In the end, I went with G-Suite (“Google Apps for Work”) for $5/month. Unlike the hundreds of vendor-specific email addresses I maintain at jgreely.com, I’ve only ever used one here, and all the other people who used to have accounts moved on during W’s first term.
Next up, working comments!
Actually, next turned out to be getting the top-quote to update randomly. The old method was a cron job that used wget to log into the MT admin page and request an index rebuild, which, given the tiny little CPU, had gotten rather slow over the years, so it only ran every 15 minutes.
Now that the site is published by running hugo on my laptop and rsyncing the output, it’s not feasible or sensible to update the quotes by rebuilding the entire site. So I wrote a tiny Perl script that regexes the current quotes out of all the top-level pages for the site, shuffles them, and reinserts them into those pages. It takes about half a second.
Since there are ~350 pages, there will be decent variety even if I don’t post for a few days and regenerate the set. If I wanted to get fancy, I could parse the full quotes page and shuffle that into the indexes, guaranteeing a different quote on each page (as long as the number of quotes exceeds the number of pages, which means I can add about 800 blog entries before I need to add more quotes. :-)
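A minimal sketch of that shuffle, written in Python rather than the Perl mentioned above. The `<div id="top-quote">` marker and the function name are assumptions made for illustration; the real pages may mark the quote differently. It works on a dictionary of page name to HTML, so reading and writing the actual files would wrap around it.

```python
import random
import re

# Hypothetical marker: assumes each page wraps its quote in this element.
QUOTE_RE = re.compile(r'(<div id="top-quote">)(.*?)(</div>)', re.DOTALL)


def shuffle_quotes(pages, rng=random):
    """Return a copy of pages (name -> HTML) with the top quotes permuted.

    Pages without a quote marker are passed through unchanged.
    """
    names = [name for name, html in pages.items() if QUOTE_RE.search(html)]
    quotes = [QUOTE_RE.search(pages[name]).group(2) for name in names]
    rng.shuffle(quotes)

    result = dict(pages)
    for name, quote in zip(names, quotes):
        # A function replacement keeps backslashes in the quote text literal.
        result[name] = QUOTE_RE.sub(
            lambda m, q=quote: m.group(1) + q + m.group(3),
            pages[name],
            count=1,
        )
    return result
```

Because the quotes are only permuted among pages that already have one, the set of quotes on the site stays the same between full rebuilds.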
The machine this site runs on hasn’t been updated in a while. The OS is old, but it’s OpenBSD, so it’s still secure. Ditto for Movable Type; I’m running an old, stable version that has some quirks, but hasn’t needed much maintenance. I don’t even get any comment spam, thanks to a few simple tricks.
There are some warts, though. Rebuild times are getting a bit long, my templates are a bit quirky, and Unicode support is just plain flaky, both in the old version of Perl and in the MT scripts. This also bleeds over into the offline posting tool I use, Ecto, which occasionally gets confused by MT and converts kanji into garbage.
Fixing all of that on the old OS would be harder than just upgrading to the latest version of OpenBSD. That’s a project that requires a large chunk of uninterrupted time, and we’re building up to a big holiday season at work, so “not right now”.
I need an occasional diversion from work and Japanese practice, though, and redesigning this blog on a spare machine will do nicely. I can also move all of my Mason apps over, and take advantage of the improved Unicode support in modern Perl to do something interesting. (more on that later)
Someone finally got around to automating a comment-spamming tool that evaded my trivial protections (rename MT CGI scripts, force preview before post). Naturally, they decided to send six different comments to three or four different articles, about a dozen times each.
Sadly for them, they put their web site into the commenter’s URL field, which I don’t display, so their efforts were in vain. Even worse, from their point of view, they sent them all from the same IP address, which meant it took about thirty seconds to clean things up. And another five to ban their entire netblock at the firewall. I didn’t even need to rebuild, since the comment pages aren’t cached (another trivial change from the defaults).
I think for the next pass, I’ll change the comment URL from /mt/hasturhasturhastur to /murfle/gleep. The best defense against automation is diversity.
Hmmm, looks like updating OpenBSD may have broken MT posting through Ecto.
…
Ah, I think it’s just a version mismatch in the chroot environment.
…
Sigh, that solved most of it, but not all. It looks like I’m going to have to reinstall a bunch of Perl modules, and then rsync them into the chroot.
…
No, wait, it seems Ecto allowed me to insert an invisible character into a blog entry that it subsequently refused to translate into something that could be uploaded via XML-RPC. Blech.
[clarification: thanks to its NeXT roots, the standard OS X text widget supports a limited subset of Emacs editing keys. Unfortunately, while it lets you use Control-Q to insert literal ASCII characters, it doesn’t know how to display all of them. While typing my mini-review of the Forerunner 201, I somehow managed to type Control-Q Control-N, and Movable Type’s XML-RPC interface coughed up a giant furball when Ecto sent it this unescaped control character.]
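For illustration, here is a small Python sketch of a check for this kind of stray character before a post is uploaded. Control-N is byte 0x0E, one of the control characters XML 1.0 does not allow in a document. The function names are hypothetical, and this is not how Ecto or Movable Type handle it.

```python
import re

# C0 control characters that XML 1.0 forbids (tab, LF, and CR are allowed).
ILLEGAL_XML_CONTROLS = re.compile(r'[\x00-\x08\x0b\x0c\x0e-\x1f]')


def find_illegal_controls(text):
    """Return (offset, code point) pairs for each forbidden control character."""
    return [(m.start(), ord(m.group())) for m in ILLEGAL_XML_CONTROLS.finditer(text)]


def strip_illegal_controls(text):
    """Return text with the forbidden control characters removed."""
    return ILLEGAL_XML_CONTROLS.sub('', text)
```

Reporting the offsets first makes it possible to see where the invisible character landed before deciding whether to silently drop it.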
Update: The response from Ecto support is “Should be fixed in next version.” Cool.
Update: And, indeed, it’s now fixed.
Dear child,
You’re not clever, you’re not funny, you’re certainly not my friend, and you have nothing interesting to say. Stop spamming my comments.
And, by the way, it took me about five seconds to wipe out your latest “contributions”, so you’re not even a real annoyance, just a bug on the windshield.
Oh, and if anyone else reading this wants a good laugh, it took this wannabe-troll three hours to come up with 18 lame comments. All wiped out with one line of SQL code and a quick rebuild.
(oops, miscounted the first time; I counted all the POST events, forgetting the mandatory preview I turned on a while back. I had to go by the logs, since I’d already nuked the actual comments. :-))
(which, by the way, is done with: delete from mt_comment where comment_ip = "nnn.nnn.nnn.nnn" and now() - comment_created_on < 1;)
(oh, and for more amusement, I’ve added each of his IP addresses to my badlife PF blacklist, so he can’t even see the site until he reconnects and gets another one. If he keeps it up, I’ll just block the entire subnet for a while; it’s not like I have so many readers that I actually care about the loss of a few random Class C networks for a few days.)
(and if he ever did rise to the level of an actual annoyance, my badlife system can trivially be extended to automatically add his IP addresses to the blacklist without human intervention; I did most of the work a long time ago to deal with mass downloads of my picture site…)
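The one-line cleanup quoted above can also be sketched as a parameterized statement. This Python version uses sqlite3 purely as a stand-in database, keeps the table and column names from the statement in the post, and assumes a one-day window with timestamps stored as `YYYY-MM-DD HH:MM:SS` text; the function name, the window, and the storage format are all assumptions.

```python
import sqlite3
from datetime import datetime, timedelta


def delete_recent_comments(conn, ip, now, window=timedelta(days=1)):
    """Delete comments from one IP address created within the window.

    Returns the number of rows removed.
    """
    # Assumes whole-second timestamps stored as text, so string
    # comparison orders them chronologically.
    cutoff = (now - window).isoformat(sep=' ')
    cur = conn.execute(
        "DELETE FROM mt_comment WHERE comment_ip = ? AND comment_created_on > ?",
        (ip, cutoff),
    )
    conn.commit()
    return cur.rowcount
```

Passing the address as a parameter avoids pasting a value from the logs directly into the SQL text.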
Not only is it painfully obvious when you come by to try to bump the page-rank for German credit “repair” agencies by manually spamming my comments, but it’s pointless, because my comment pages don’t show the MTCommentURL field.
If you put the URL in the body, it will actually work (for a few more minutes, at least), but then it will be even more obvious what sort of cretinous lowlife you are, and make it even easier to delete your spam.
Why don’t you go join that asshat who tried to kill himself by eating everything on the menu at McDonald’s for thirty days? You’ve already got the public vomiting down pat, so the weight gain and failing health should be a snap.
My contribution to warding off comment spam: reduce its value to the spammers by breaking their URLs. The blog owner (and trusted friends) can keep their URLs intact by adding a password to their comments.
This doesn’t stop someone from flooding your blog with spam; it’s just a lightweight filter to eliminate the benefit. pornospam.com won’t get hits or page-rank from a URL that’s been rewritten to pornospam-DOT-com.
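A rough Python sketch of the idea, not the actual Movable Type plugin: hostnames in a comment have their dots rewritten to -DOT- unless the commenter supplied the right password. The function name, the parameters, and the deliberately crude hostname pattern are all assumptions made for illustration.

```python
import re

# Crude hostname matcher: dot-separated labels ending in an alphabetic TLD.
# It will also catch things like file names in a path, which is harmless here.
HOST_RE = re.compile(r'\b(?:[a-z0-9-]+\.)+[a-z]{2,}\b', re.IGNORECASE)


def break_urls(comment, password=None, trusted_password=None):
    """Return the comment with hostnames defanged, unless the password matches."""
    if trusted_password is not None and password == trusted_password:
        return comment  # trusted commenter: leave URLs intact
    return HOST_RE.sub(lambda m: m.group().replace('.', '-DOT-'), comment)
```

The rewritten text is still readable by a human who wants to follow the link, but it is no longer something a crawler will treat as a URL.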
I was extremely surprised not to find this in MT’s built-in tags. MTEntryCategory exists, to tell you what the name of a blog entry’s primary category is, but there was no function to provide a link to the matching category archive page.
Well, now there is.