Please, crash for me

I have never been more annoyed at an application’s failure to fail.

We have this service daemon that performs various actions on incoming images. Recently, it’s been crashing at random intervals, leaving behind a core file that tells us precisely what function it segfaulted in, but includes nothing to tell us where the image came from. All we know is that somewhere out on the Internet, there are JPEG images that crash our copy of the IJG JPEG library in jpeg_idct_ifast().

Since this was affecting customer performance, we really wanted to know, so we cranked up the logging on one of the affected thirty-two servers, to capture the incoming request URLs. And it hasn’t crashed since.

Four days of crashes every hour or so, and now nothing. The good news is that our customers are less unhappy. The bad news is that our developers don’t have a test case to code a fix against.

So now I’m trolling the web, looking for corrupt JPEGs. I strongly suspect that the images that caused our problem were intended to exploit holes in a certain other OS, but I can’t be sure until I find some and feed them to our server. Sigh.