Wednesday, September 10 2003

iTMS weekly reports

No, I didn’t buy another big batch of music from the iTunes Music Store yet, although I probably will soon, to stock up the iPod for my next road trip to Las Vegas. I have been keeping an eye on the store, though, and after corresponding with Brian Tiemann, I decided to investigate an oddity we’d both noticed: the week-by-week “Just Added” report ain’t no such thing.

The report allegedly lists the albums added in each of the past four weeks. It looks impressive at first, but after a few viewings, we noticed that the list of new albums for a particular week changes retroactively. That is, last week this week wasn’t this week last week.

How different? I was curious enough to grab the report directly from the server and parse the XML. The report produced on 9/2 claimed the following album counts for the four most recent weeks, newest first: 179, 81, 260, and 84. On 9/9, the counts were 70, 157, 82, and 260.

At first glance, it looks like 22 albums that were updated on 9/2 were updated again on 9/9, since that week’s count dropped from 179 to 157, but that’s not the case. Comparing the unique album IDs, I found that 36 albums moved from the week of 9/2 to the week of 9/9, and 16 albums that weren’t present in the 9/2 report were added to that week after the fact. Two more that were “just added” in last week’s data are no longer available in the store. That accounts for the difference: 179 - 36 - 2 + 16 = 157. There was also one album added retroactively to the week of 8/26, but that week is otherwise unchanged.
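The comparison above boils down to set arithmetic on album IDs. Here is a minimal sketch in Python, not the Perl actually used for this post. The function name `compare_reports` and the sample IDs are made up for illustration, and "missing" here only means "absent from the newer report", which is an assumption about how a vanished album would show up.

```python
def compare_reports(old_week_ids, new_week_ids, new_report_all_ids):
    """Compare one week's album IDs as listed in two successive reports."""
    old_ids = set(old_week_ids)
    new_ids = set(new_week_ids)
    all_new = set(new_report_all_ids)
    gone = old_ids - new_ids  # listed under this week before, not any more
    return {
        "kept": old_ids & new_ids,
        "moved": gone & all_new,         # now listed under a different week
        "missing": gone - all_new,       # nowhere in the newer report
        "added_late": new_ids - old_ids, # appeared in this week after the fact
    }

# Placeholder IDs, not real iTMS data.
old_week = [101, 102, 103, 104, 105]
new_week = [101, 102, 106]
new_report = [101, 102, 103, 106, 107]

result = compare_reports(old_week, new_week, new_report)
for name, ids in result.items():
    print(name, sorted(ids))

# The same bookkeeping with the counts quoted in the post.
assert 179 - 36 - 2 + 16 == 157
```

The size of the new week should always equal the old size minus the moved and missing albums plus the late additions, which makes a handy sanity check on the parsing.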

Unfortunately, I accidentally overwrote last week’s raw XML file, so I don’t know what those two now-missing albums were called; I only have their unique IDs. It’s possible that they were changed so significantly that they were given completely new database records. I’ll follow up on this next week, since I was careful to keep the raw data this time.

On a side note, it’s difficult to parse the XML file generated by iTMS, because it doesn’t contain any semantic markup; it’s all physical layout, so there was no point in using a real XML parser instead of hacked-up regular expressions in Perl. There’s something revolting about “looking for an object with styleSet=normal12 that starts with a bold-faced month name”.
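To give a flavor of that kind of layout-driven scraping, here is a rough sketch in Python rather than Perl. The sample markup is invented: the only detail taken from the real file is the `styleSet=normal12` attribute followed by a bold-faced month name. The element names, the URL format, the `id=` parameter, and the function `albums_by_week` are all assumptions made for illustration.

```python
import re

# Invented fragment; the real iTMS markup almost certainly differs.
SAMPLE = """
<TextView styleSet="normal12"><b>September 2, 2003</b></TextView>
<GotoURL url="http://example.invalid/album?id=1001">Album One</GotoURL>
<GotoURL url="http://example.invalid/album?id=1002">Album Two</GotoURL>
<TextView styleSet="normal12"><b>August 26, 2003</b></TextView>
<GotoURL url="http://example.invalid/album?id=1003">Album Three</GotoURL>
"""

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# A week heading: styleSet="normal12", then bold text starting with a month.
HEADING = re.compile(
    r'styleSet="normal12"[^>]*>\s*<b>((?:' + MONTHS + r')\b[^<]*)</b>'
)
# Assumed form of an album link: a numeric id= parameter in a URL.
ALBUM_ID = re.compile(r'[?&]id=(\d+)')

def albums_by_week(text):
    """Group album IDs under the most recent week heading seen."""
    weeks = {}
    current = None
    for line in text.splitlines():
        match = HEADING.search(line)
        if match:
            current = match.group(1).strip()
            weeks[current] = []
            continue
        if current is not None:
            weeks[current].extend(ALBUM_ID.findall(line))
    return weeks

for week, ids in albums_by_week(SAMPLE).items():
    print(week, len(ids), ids)
```

Anything like this breaks the moment the layout changes, which is exactly the complaint: the structure being matched is presentation, not meaning.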

Anyway, the short version is that the “Just Added” report is actually a “recently modified” report, and doesn’t necessarily reflect any actual increase in the selection. I’d need a much longer baseline to determine how many new albums are really being added from week to week, and much more detailed data to figure out what is actually being changed in existing albums. Corrected typos? Rescanned album covers? Re-ripped tracks? Partial albums completed?