Wednesday, October 06, 2004

The Long Tail

Forget squeezing millions from a few megahits at the top of the charts. The future of entertainment is in the millions of niche markets at the shallow end of the bitstream.

When working on collaborative filtering systems, we’ve long wondered how much of the action is in the low-volume items in the “tail” of the Zipf distribution that item popularity always seems to follow. For some items, such as theatrical movies, there is almost no tail, since low-popularity movies just don’t get made or distributed. For others, such as books, it’s clear that the tail has a lot of members, but we never knew how much mass was in it. An article in the latest Wired magazine (I held off on sharing it because Wired waited a while before putting it online) posits that the tail for many kinds of items is big and profitable, and that recommender systems (including collaborative filtering systems and human recommendations) are pushing more and more traffic to the broad base of less-popular items. Consider this:


The average Barnes & Noble carries 130,000 titles. Yet more than half of Amazon's book sales come from outside its top 130,000 titles. Consider the implication: If the Amazon statistics are any guide, the market for books that are not even sold in the average bookstore is larger than the market for those that are (see "Anatomy of the Long Tail").
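To get a feel for the "how much mass is in the tail" question, here is a minimal sketch that assumes sales follow an idealized Zipf curve, with sales of the item at rank r proportional to 1/r^s. The function name, the catalog size, and the exponent values are my own assumptions for illustration; none of them come from the article or from Amazon.

```python
def zipf_tail_share(n_items: int, head_size: int, exponent: float = 1.0) -> float:
    """Fraction of total sales that falls outside the top `head_size` items.

    Assumes an idealized Zipf curve: sales at rank r are proportional to
    1 / r**exponent. Real sales data will only roughly follow this.
    """
    total = sum(rank ** -exponent for rank in range(1, n_items + 1))
    head = sum(rank ** -exponent for rank in range(1, min(head_size, n_items) + 1))
    return (total - head) / total


if __name__ == "__main__":
    # Hypothetical catalog of 1,000,000 titles; 130,000 is the bookstore
    # inventory size quoted above.
    for s in (1.0, 0.75, 0.5):
        share = zipf_tail_share(1_000_000, 130_000, exponent=s)
        print(f"exponent={s}: tail share = {share:.1%}")
```

With these made-up numbers, a pure 1/r curve puts well under half of the mass outside the top 130,000, so a figure like the one quoted would take a flatter curve, a much larger catalog, or both. That is a property of the toy model, not a claim about the real data.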

In addition, these unpopular items are often more profitable, for example when a customer buys a music track from a decades-old album. There are a lot more interesting tidbits in the article, along with some implications for automatic recommender system design: it’s tougher to make good recommendations from sparser data, and the tail may need different parameter settings for best performance than those that give the best results for very popular items. It gets even more interesting if it turns out that you want to bias your results towards the less popular items. One more conclusion: this seems to indicate that there is even more value in being the highest-traffic site (as with eBay and Amazon). Only if you have enough traffic in the tail can you make quality recommendations about the items in the tail, which drives more traffic and profits and lets you be more competitive on price…
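As one way to picture biasing results towards less popular items, here is a rough sketch that re-ranks a recommender's output by discounting each score according to how popular the item already is. This is a simple heuristic I am assuming for illustration, not something the article proposes; the function name, the `alpha` knob, and the sample data are all hypothetical, and scores are assumed to be non-negative.

```python
import math


def rerank_toward_tail(candidates, popularity, alpha=0.5):
    """Re-rank (item, score) pairs so that less popular items move up.

    candidates: list of (item, score) pairs from any recommender, scores >= 0.
    popularity: dict mapping item -> interaction count (missing items count as 0).
    alpha: strength of the bias; 0 keeps the original score order.
    """
    def adjusted(pair):
        item, score = pair
        # Log-damped popularity penalty (an assumed heuristic).
        penalty = (1.0 + math.log1p(popularity.get(item, 0))) ** alpha
        return score / penalty

    return sorted(candidates, key=adjusted, reverse=True)


if __name__ == "__main__":
    # Made-up scores and counts, purely for illustration.
    recs = [("megahit", 0.9), ("niche_album", 0.8)]
    counts = {"megahit": 100_000, "niche_album": 50}
    print(rerank_toward_tail(recs, counts, alpha=0.0))
    print(rerank_toward_tail(recs, counts, alpha=1.0))
```

The catch noted above still applies: the scores for tail items come from sparse data, so pushing them up the list also pushes up the noisiest predictions, and `alpha` would need tuning against real traffic.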

Link:
http://www.wired.com/wired/archive/12.10/tail.html

Oh, there was also a Slashdot discussion about it yesterday, with the typical low S/N ratio: http://slashdot.org/article.pl?sid=04/10/05/185236&tid=188&tid=187
