Bathtime Buddies…

No, not the kind with cheesecake. Last month, Wonderduck stumbled across an onsen-themed set of rubber-duck capsule toys (which reminds me, “no, Amazon won’t ship it internationally, but I have a reshipping agent that I use, and I have some other stuff that needs handled that way as well; I just need to update my account with them, because we’re moving our office”).

In the comments, I linked to one of the many bucket-o-duckies products on Amazon Japan. Here’s what it looks like when someone puts them to good use:


Dubious capsule toys

Honestly, I’m not sure I’d want to put 5 bucks into this machine. エロ過ぎる = ero-sugiru = “too sexy”.


Now, if it dispensed pokemon balls containing horny monster girls… oh, wait, that’s an anime plot.

Duck Soap

A little something we stumbled across while heading for the Terry Fator show at The Mirage.

Great show, by the way.

And if you want some really good Italian food in Vegas, go to Nora’s.

Reasons to have an OpenBSD router at home, Amazon Wand Edition

Since the new Amazon Dash Wand is effectively free for Prime customers, and it gives you a home-automation controller, bar-code scanner, and a hand-held Alexa device that is not always listening, I ordered one.

When it arrived this morning, I followed the instructions, opened the Amazon app on my iPhone, and went through the setup process. Wifi Fail. Wifi Fail. Wifi Fail. “You should contact customer service”.

The first 20+-minute call went through a bunch of cookbook questions about who my Internet provider was, and how to change the channel on my router. I had a brief flashback to the Seventies, then realized their script assumed Comcast meant “all-in-one cable modem, router, and wireless access point”. I played along, knowing this would make no difference, and the call eventually ended in an RMA.

I was curious to see if it really was a wireless problem, so I logged into the OpenBSD router, checked the DHCP logs, and found an entry for a new Amazon MAC address. I fired up tcpdump and went through the setup again, and sure enough, the device got DHCP, connected to the Internet for DNS, connected to an Amazon server, and then started trying to talk to a public (non-Amazon) NTP server to set its date and time.

It failed every time. Annoyingly, it wasn’t even looking in DNS for its NTP server; the addresses were hardcoded in either the build or the config it had downloaded.

So, armed with the knowledge that the hardware was fine, I tried to get back through to customer service. An hour later, after two different people tried to debug phone app, wireless, and Bluetooth problems (including telling me to turn on GPS on my phone!), I finally got someone to twiddle the right bits so it could connect to servers that were up, and then cancel the RMA.

Now I have a Dash Wand. Ho, ho, ho.

Corpus Fun

I’m pretty sure “futanari” is not Dutch. Also “gmail”, “iphone”, “http”, “cialis”, and “jackalope”. “bewerkstelligen”, on the other hand, fits right in.

For my new random word generator, I’ve been supplementing and replacing the small language samples from Chris Pound’s site. The old ones do a pretty good job, but the new generator has built-in caching of the parsed source files, so it’s possible to use much larger samples, which gives a broader range of language-flavored words. 5,000 distinct words seems to be the perfect size for most languages.

Project Gutenberg has material in a few non-English languages, and it’s easy to grab an entire chapter of a book. Early Indo-European Online has some terrific samples, most of them easily extracted. But what looked like a gold mine was Deltacorpus: 107 different languages, all extracted with the same software and tagged for part-of-speech. And the range of languages is terrific: Korean, Yiddish, Serbian, Afrikaans, Frisian, Low Saxon, Swedish, Catalan, Haitian Creole, Irish, Kurdish, Nepali, Uzbek, Mongol, etc., each with around 900,000 terms. The PoS-tagging even made it easy to strip out things that were not native words, and generate a decent-sized random subset.
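For the curious, that strip-and-subset step could look something like the sketch below. It is not the script I actually used: it assumes each corpus line is an index, a token, and a part-of-speech tag separated by whitespace, and the `KEEP_TAGS` set, the `sample_words` name, and the alphabetic-only test are all made up for illustration.

```python
import random

# Assumption: which tags count as "native words" is a guess for this sketch.
KEEP_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}

def sample_words(lines, size=5000, keep_tags=KEEP_TAGS, seed=None):
    """Return up to `size` distinct lowercase tokens with a kept PoS tag."""
    words = set()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank separators and malformed rows
        _, token, tag = fields
        # Drop punctuation, numbers, and anything with digits or underscores.
        if tag in keep_tags and token.isalpha():
            words.add(token.lower())
    pool = sorted(words)  # sort so a fixed seed gives a repeatable subset
    rng = random.Random(seed)
    return rng.sample(pool, min(size, len(pool)))
```

Feeding it an open corpus file, e.g. `sample_words(open(path, encoding="utf-8"))` with a hypothetical `path`, yields the random subset. A filter like this trusts the tags completely, so anything mis-tagged as a noun or verb sails right through.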

Then I tried them out in the generator, and started to see anomalies: “jpg” is not generally found in a natural language, getting a plausible Japanese name out of a Finnish data set is highly unlikely, etc. There were a number of oddballs like this, even in languages that I had to run through a romanizer, like Korean and Yiddish.

So I opened up the corpus files and started searching through them, and found a lot of things like this:

437 바로가기    PROPN   
438 =   PUNCT   
439 http    VERB    
440 :   PUNCT   
441 /   PUNCT   
442 /   PUNCT   
443 www NOUN    
444 .   PUNCT   
445 shoop   NOUN    
446 .   PUNCT   
447 co  NOUN    
448 .   PUNCT   
449 kr  INTJ    
450 /   PUNCT   
451 shop    PROPN   
452 /   PUNCT   
453 goods   NOUN    
454 /   PUNCT   
455 goods_list  NOUN    
456 .   PUNCT   
457 php NOUN    
458 ?   DET 
459 category    NOUN    
460 =   PUNCT   
461 001014  NUM 

1   우리의  ADP 
2   예제에서    NOUN    
3   content X   
4   div에   NOUN    
5   float   VERB    
6   :   PUNCT   
7   left    VERB    
8   ;   PUNCT   

Their corpus-extraction script was treating HTML as plain text, and the pages they chose to scan included gaming forums and technology review sites. Eventually I might knock together a script to decruft the original sources, but for now I’m just excluding the obvious ones and skimming through the output looking for words that don’t belong. This is generally pretty easy, because most of them are obvious in a sorted list:


Missing some of them isn’t a big problem, because the generator uses weighted-random selection for each token. If a start token only appears once in the source, it won’t be selected often, and there are few possible transitions out of it. It’s still worth cleaning them up, since they become more likely when you mix multiple language sources together.
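To make the weighting argument concrete, here is a minimal sketch of that kind of generator. It is an illustration, not my actual code: it assumes fixed-length letter chunks as tokens, and `build_model`, `weighted_choice`, and `generate` are hypothetical names.

```python
import random
from collections import Counter, defaultdict

END = "\n"  # end-of-word marker, assumed never to occur inside a word

def build_model(words, order=2):
    """Count start tokens and token -> next-letter transitions."""
    starts = Counter()
    transitions = defaultdict(Counter)
    for word in words:
        if len(word) < order:
            continue
        starts[word[:order]] += 1
        padded = word + END
        for i in range(len(padded) - order):
            transitions[padded[i:i + order]][padded[i + order]] += 1
    return starts, transitions

def weighted_choice(counter, rng):
    """Pick one key with probability proportional to its count."""
    keys = list(counter)
    return rng.choices(keys, weights=[counter[k] for k in keys], k=1)[0]

def generate(starts, transitions, rng, max_len=12):
    """Build one word by walking the transition table from a start token."""
    order = len(next(iter(starts)))
    word = weighted_choice(starts, rng)
    while len(word) < max_len:
        nxt = weighted_choice(transitions[word[-order:]], rng)
        if nxt == END:
            break
        word += nxt
    return word
```

With nine copies of “banana” and one stray “jpg” in the source, the start token “jp” gets picked about one time in ten, and its only path leads straight back to “jpg”. In a 5,000-word sample, a one-off like that almost never surfaces.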

Camelcide, Effectomy, and Memetical

How to tell that your new random-word generator works: feed it the text of the Jargon File and get back things that you could easily come up with real jargon definitions for…


Second impression

I decided to see how much detail I could get with the 20-degree v-bit. This was carved out of the end grain of a generic hardwood 1-inch square dowel.

Update: I shined a flashlight onto the seal to get a better look at it.

First impressions

This is a 2-inch square seal created in Illustrator with my script, imported to VCarve Desktop, and cut from a mounted linoleum block on my Nomad with a 20-degree v-bit. The hardest part about getting a good firm impression on paper is that my stamp pad is too darn small; I’ll have to buy an uninked pad at an office supply shop and load it up with the good pigment.

Zettai Ryouiki Seal of Approval

Also making a good impression is glamour model Meru Tsujimura, who cannot be found at office supply shops…