Wednesday, April 7 2004

iTunes Music Recommendation service

This is a new “if you like X, try Y” service set up as a student project at the University of Illinois at Urbana-Champaign. Does it work better than Goombah? Dunno yet; so far I haven’t been able to get it to work. I can upload my iTunes database, but it fails trying to download recommendations (probably due to being Farked, Slashdotted, BoingBoinged, Lileksed, Instapunted, or some combination of high-profile links).

I can think of two reasons why it’s a better bet, though: first, it looks like they’re doing the work on the server side, rather than chewing up hours of CPU time on your computer, and second, Goombah hasn’t updated their client or database in months.

Ah, just got through, and discovered one disadvantage to server-side processing:

Your music database is being processed. This window will show your recommendations once they’ve been computed.
Notice: The server is a little backed up, hence the long wait. Once the server gets caught up the wait will be ALOT shorter, until then I would recommend that you don’t hit the resend button.
Your estimated wait for results is 8 hours, 44 minutes, 40 seconds. You may quit and log back in at anytime to check on the status of your recommendations.

Update: The perils of popularity. Five hours later, it now says my recommendations will be ready in 13 hours, 30 minutes. They seem to have plenty of bandwidth, so maybe this would be a good time for Apple to donate a few Xserves to the cause. Xgrid would be ideal for distributing this application across a cluster.

Update: Okay, ten hours after submitting my iTunes database, it now says it will be 17 hours, 42 minutes (which apparently means 17 total, not 17 more…) before my recommendations are ready. A message on their forums suggests that their fuzzy-match algorithm is choking on the wide variety of bad data (from CDDB, MusicBrainz, and the users’ own whimsical transcriptions) being introduced into their database. Oops.

Update: 44.5 hours after submitting my iTunes database, the (revised, much clearer) estimate is now 5.5 hours remaining. From the discussion on their forums, it looks like they’re processing each request serially, and updating the estimates each time a request completes. Unfortunately, as the database grows, each request takes longer and longer to process, so the estimate becomes progressively worse.
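For the curious, here's a toy model of why that kind of estimate keeps slipping. It's only a sketch: I have no idea what their real cost function looks like, so the linear `service_time` formula and every number in it are made up for illustration. The naive estimate multiplies the queue length by what one request costs right now; the actual wait adds up requests that each get a little slower as the database grows.

```python
def service_time(db_size, base=60.0, growth=0.5):
    """Seconds to process one request when the database already holds
    db_size libraries. Hypothetical linear cost model, not the real one."""
    return base + growth * db_size


def naive_estimate(position, db_size):
    """Assume every request ahead of you costs what a request costs now."""
    return position * service_time(db_size)


def actual_wait(position, db_size):
    """Each finished request grows the database by one library,
    so the next request in line is slightly slower."""
    return sum(service_time(db_size + i) for i in range(position))


if __name__ == "__main__":
    # Compare the two for a few queue lengths, starting from an
    # (assumed) database of 500 libraries.
    for position in (10, 100, 1000):
        est = naive_estimate(position, db_size=500)
        real = actual_wait(position, db_size=500)
        print(f"{position:>5} ahead: estimated {est / 3600:.1f} h, "
              f"actual {real / 3600:.1f} h")
```

With a queue of one the two numbers agree, and the gap widens as the queue gets longer, which matches what I've been watching all week.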

I think the students involved just learned an important lesson about scalability. I wonder if their grade depends on how they apply this lesson. :-)