more discussion about tail size

Once again men are heatedly discussing tail size. Just ponder this question: How large is the long tail? Personally, I’m going to keep looking before I decide for myself.

While it’s novel to bring mathematical precision to such matters, unfortunately it seems to me that this mathematical model focuses attention on misleading features. The model says that the share of the k most popular items is log(k)/log(n), where n is the total number of items on offer. Thus, in this model, the total number of items on offer determines the share of the most popular items.
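To see how strongly this model ties the top share to the catalog size, here is a minimal sketch that simply evaluates log(k)/log(n). The function name `top_share_log_model` is my own label, not anything from the model's authors, and k = 10 is an arbitrary choice for illustration; the two values of n are the two million and six billion mentioned in this post.

```python
import math


def top_share_log_model(k: int, n: int) -> float:
    """Share of the k most popular items under the log(k)/log(n) model.

    Hypothetical helper for illustration; it only evaluates the formula
    quoted in the text.
    """
    if n < 2 or not 1 <= k <= n:
        raise ValueError("need n >= 2 and 1 <= k <= n")
    return math.log(k) / math.log(n)


if __name__ == "__main__":
    # Same k, two very different catalog sizes.
    for n in (2_000_000, 6_000_000_000):
        share = top_share_log_model(10, n)
        print(f"n = {n:>13,}: top-10 share = {share:.3f}")
```

In this sketch the only thing that moves the top-10 share is n itself, which is exactly the feature of the model that I find unconvincing.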

This isn’t a sensible model. Mathematically, a power law describes an infinite number of items on offer. The slope of the power law, or more precisely, the slope of an approximating power law at the high popularity end of the distribution, usually describes well the high-end shares. The question is what determines the slope of the power law. The number of items on offer isn’t a good answer to that question, particularly for n varying from two million to six billion.
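As a rough illustration of why the slope matters more than the item count, the sketch below holds the number of items fixed and varies only the power-law exponent. It assumes a finite catalog in which the item at rank r has popularity proportional to r^(-alpha); the function name `top_share_power_law`, the catalog size of 100,000, and the exponents tried are all assumptions chosen for illustration, not estimates from any data.

```python
def top_share_power_law(k: int, n: int, alpha: float) -> float:
    """Share of the top k of n items when rank r has weight r ** -alpha.

    Hypothetical helper: a truncated power law over a finite catalog,
    used here only to show how the exponent drives the top share.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    weights = [r ** -alpha for r in range(1, n + 1)]
    return sum(weights[:k]) / sum(weights)


if __name__ == "__main__":
    n = 100_000  # assumed catalog size, held fixed throughout
    for alpha in (0.5, 1.0, 1.5, 2.0):
        share = top_share_power_law(10, n, alpha)
        print(f"alpha = {alpha:.1f}: top-10 share = {share:.3f}")
```

With n unchanged, a steeper slope pushes the share of the ten most popular items up sharply, so the interesting question remains what sets the slope.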

For a concrete example, consider the popularity of the ten most popular given names. The set of possible given names (given names on offer) is huge, and probably hasn’t changed much in the past two hundred years. However, the combined share of the ten most popular given names for males in England has fallen from about 85% in 1800 to about 28% in 1994. If you want to understand changes in the popularity of the most popular items in a collection of symbols instantiated and used in a similar way, try to understand this change.

* * *

For additional amusement, here’s a post I stuck in the galbithink.org newsfeed a little more than a year ago, back in the time of Web-Pleistocene:

Tail aficionados might enjoy pondering the distinguishing features of the long tail. I think that size, which tail authorities have categorized as long or short, matters less than shape. It should be no surprise to anyone that shape can change over time. For some graphical evidence, see the detailed images here.

So don’t just sit around complaining that “diversity plus freedom of choice creates inequality”. Power laws don’t imply any particular amount of inequality. The power of the power law determines the difference between tails. Look at some examples and see for yourself!

6 thoughts on “more discussion about tail size”

  1. Douglas,

    As it happens, I gave a speech at Google over the weekend on this exact question. Sometimes a “drooping tail” is due to some removable inefficiency in the market, such as poor findability or limited distribution, and sometimes it’s actually the natural shape of demand in that marketplace. The difference between power laws, which are “heavy-tailed” distributions, and “lognormal” curves, which aren’t, is a fascinating area of research, and I discuss it in this post.

  2. Sorry, I interrupted your normal conversation with your friends.
    Did you write about child labor in England? King’s College,
    Cambridge, June ’94?
    If you did, I really got a lot from it.
    Thank you.
    Sandra.
