The Million Song Dataset (MSD) team is proud to partner with Last.fm to announce a new complementary dataset: the Last.fm dataset. It contains song-level tags and song-to-song similarity. And it's big (i.e. BIG)!
http://labrosa.ee.columbia.edu/millionsong/lastfm
A few numbers:
* 943,347 matched tracks MSD <-> Last.fm
* 505,216 tracks with at least one tag
* 584,897 tracks with at least one similar track
* 522,366 unique tags
* 8,598,630 (track - tag) pairs
* 56,506,688 (track - similar track) pairs
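For a rough sense of scale, the counts above can be turned into coverage ratios and per-track averages. This is only a minimal sketch in Python: the variable and function names are illustrative, and the similarity average assumes each (track - similar track) pair is counted once from the side of a track that has similar tracks, which the numbers above do not state.

```python
# Counts copied from the list above.
MATCHED_TRACKS = 943_347
TRACKS_WITH_TAGS = 505_216
TRACKS_WITH_SIMILARS = 584_897
TRACK_TAG_PAIRS = 8_598_630
TRACK_SIMILAR_PAIRS = 56_506_688


def summarize():
    """Return coverage ratios and per-track averages derived from the counts."""
    return {
        # Fraction of matched tracks that carry at least one tag.
        "share_tagged": TRACKS_WITH_TAGS / MATCHED_TRACKS,
        # Fraction of matched tracks that have at least one similar track.
        "share_with_similars": TRACKS_WITH_SIMILARS / MATCHED_TRACKS,
        # Mean number of tags among tracks that have tags.
        "tags_per_tagged_track": TRACK_TAG_PAIRS / TRACKS_WITH_TAGS,
        # Mean number of similar tracks among tracks that have any
        # (assumes pairs are counted per source track; see note above).
        "similars_per_track": TRACK_SIMILAR_PAIRS / TRACKS_WITH_SIMILARS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

Under those assumptions, a bit over half of the matched tracks have tags, about 62% have similar tracks, and a tagged track carries about 17 tags on average.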
We thank Last.fm (http://www.last.fm/) for making this data available. It is the largest addition to the MSD so far, and we are convinced that its impact on music information retrieval will be considerable.
As always, we appreciate any feedback! For instance, my favorite tag so far is "Acid Smurfs". A few additional notes on the MSD:
- we are working on some additional data regarding collaborative filtering; more on this at ISMIR
- we turned the CAL500 and CAL10K datasets into MSD format (http://bit.ly/oyBCwQ)
- please consider attending our tutorial at ISMIR (http://bit.ly/pSwlEA)
Happy swimming in data!
Thierry Bertin-Mahieux
Million Song Dataset team
http://labrosa.ee.columbia.edu/millionsong/