Global Android activations and the power law

Posted on 03/07/2011


I recently saw a video mapping the rates of global Android activations. Besides looking very nice, it piqued my interest because I noticed something in it that has been kind of bothering me recently. That thing is, of course*, the power law.

Alright, so first the video. It comes right from the Android Developers YouTube channel, and it shows the activations of new Android phones from the platform’s debut in October 2008 to today. A couple of things to notice are the releases of the Motorola Droid in late 2009 and the Samsung Galaxy S in Europe in the summer of 2010, as well as the concentration of activations in urban areas.

(See some coverage here)

Now here’s where this gets interesting.  Notice how the adoption rates just kind of percolate for a while until—boom—something happens (like the release of a new phone).  I have been reading some interesting literature coming out of various research groups on social complexity, especially from the prolific Barabási lab.  What is relevant here is the infamous heavy tail.

(The heavy tail)

What is a heavy/long tail? While the term is often used interchangeably with the power law, it fundamentally describes a distribution in which the frequency of an event varies as a power of some other attribute of the event. Researchers sometimes note that it has ‘memory’ because one event is influenced in a dynamic way by another. The result is a small number of very frequent observations (the peak) followed by a really long tail made up of a large number of low-frequency events. Think about it like this: if you take the number of hits on every website available from a Google search in a single day and plot the sites in order of popularity, the curve is going to show a few very large numbers (like Wikipedia) and lots of small ones (like this blog). Plotted on a log-log scale, this gives a straight line sloping downward from left to right. Now this sounds kind of obvious at first: not every website is going to be equally popular. Yet it all suddenly becomes a little crazy when you think back to our friend the normal curve.
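To make the straight-line claim concrete, here is a minimal sketch in Python. The numbers are made up for illustration: I assume the site ranked k gets about 1,000,000 / k^1.2 hits a day, where both the constant and the exponent (`ALPHA`) are hypothetical rather than measured. The helper `loglog_slope` is just an ordinary least-squares fit on the logged values.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) plotted against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var

# Hypothetical data, not real traffic: site ranked k gets 1_000_000 / k**ALPHA hits.
ALPHA = 1.2
ranks = list(range(1, 1001))
hits = [1_000_000 * k ** -ALPHA for k in ranks]

slope = loglog_slope(ranks, hits)
top_share = sum(hits[:10]) / sum(hits)

print(f"log-log slope: {slope:.2f}")
print(f"share of all hits going to the top 10 of 1000 sites: {top_share:.0%}")
```

Because the data here is an exact power law, the fitted slope comes back as the negative of the exponent. Real traffic data would be noisier, and a straight-looking line on a log-log plot is a hint of a power law rather than proof of one.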

(Our friend, the normal curve)

Recall that if you take a large enough sample of adult humans and measure their heights, most observations will fall between around 5-6 feet with few less than about 4 feet and few more than 7.  It looks like a single peak with two tails, where mean and standard deviation can describe the distance of an individual from the norm.

 

I remember a professor in a statistics class mentioning that we shouldn’t try to look for normal distributions everywhere, because we were bound to find them anywhere if we looked hard enough. As a side note, this seems like all the more reason to look for them, but that’s beside the point. Although the normal curve is still more or less ubiquitous, a large cadre of researchers, mostly physicists, are applying the power law to just about anything they can get their hands on. It has been used to describe the outbreak of epidemics, mass migrations, websites and blogs, Twitter, and many, many things in physics, engineering, and computer science.

(Power law at work with Facebook apps)

(and, yes, with online gaming too)

This observation isn’t entirely new; it’s just that the way of describing it shakes things up. In the 19th century a number of political economists, quite worried about rapid urbanization, made the apt observation that most of the world is almost completely empty, while a small number of areas are extremely densely populated. These fellows just didn’t have the mathematical and computational tools necessary to adequately describe their observations. Either way, this happens to be a classic application of the power law, just like Pareto’s famous 80/20 rule, which suggests that 20% of the people on earth own 80% of the wealth. Ignoring some of the problems with this idea that others have pointed out, I think you get the idea that this is a really powerful concept.

Now where this really gets interesting (edit: as someone just suggested, we may not all find this equally interesting, but you’ll just have to soldier on if you want to get to the cool visualizations) is when we look back at our Android adoption video. When creating a model that explains this trend, most social scientists would actually start with a Poisson process, which assumes that in some short time interval an individual (let’s say it’s me) acts with a probability equal to the rate of the activity (say, getting a new phone) multiplied by the length of the interval. In other words, it predicts that the time between two consecutive events (getting a new Android phone and then having to buy another one) follows an exponential distribution. The work of Albert-László Barabási, one of the seminal researchers on power laws in the social sciences, shows that this is not so useful in many situations. As the staggering amount of work his lab has produced indicates (see here), our behavior often follows non-Poisson distributions with bursts of activity. As he writes:

The differences between Poisson and heavy-tailed behaviour are striking: a Poisson distribution decreases exponentially, forcing the consecutive events to follow each other at relatively regular time intervals and forbidding very long waiting times. In contrast, the slowly decaying, heavy-tailed processes allow for very long periods of inactivity that separate bursts of intensive activity (Barabási 2005).

(From Barabási 2005.  The upper image is a Poisson distribution, the lower is a power law and exhibits burstiness)
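If you want to see the contrast Barabási describes without the chart, here is a rough simulation sketch. It draws waiting times between events from an exponential distribution (the Poisson case) and from a Pareto distribution (a simple stand-in for a heavy tail). The shape parameter 1.5, the sample size, and the seed are my own arbitrary choices, not values from the paper, and the Pareto draw is not Barabási's actual model.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
N = 10_000

# Poisson process: waiting times between events are exponentially distributed.
poisson_gaps = [rng.expovariate(1.0) for _ in range(N)]

# Heavy-tailed stand-in: Pareto waiting times (shape 1.5 is an assumption).
heavy_gaps = [rng.paretovariate(1.5) for _ in range(N)]

def longest_vs_typical(gaps):
    """Ratio of the longest wait to the median wait."""
    return max(gaps) / statistics.median(gaps)

poisson_ratio = longest_vs_typical(poisson_gaps)
heavy_ratio = longest_vs_typical(heavy_gaps)

print(f"Poisson:      longest wait is {poisson_ratio:.0f}x the median")
print(f"heavy-tailed: longest wait is {heavy_ratio:.0f}x the median")
```

The point of the ratio is that in the exponential case the longest wait is never wildly out of line with a typical one, while the heavy-tailed case throws up occasional enormous gaps. Those long silences between clusters of short waits are the periods of inactivity in the quote.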

I think you can see where I’m going with this.  It is not a particularly new suggestion that consumption practices follow a power law; however, this offers a good example of the concept in action.  Look at the video again, this time paying attention to bursts of activity (edit: as I recently learned, this is called ‘burstiness,’ and there is an actual algorithm for measuring it); you can see exactly what Barabási describes in the image above.
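I am not sure which algorithm was meant, but one commonly used measure is the burstiness coefficient attributed to Goh and Barabási: B = (σ − μ) / (σ + μ), where μ and σ are the mean and standard deviation of the waiting times between events. Here is a minimal sketch, assuming that definition. The three sample series are synthetic, and the Pareto shape of 1.5 is again an arbitrary choice of mine.

```python
import random
import statistics

def burstiness(gaps):
    """Burstiness coefficient B = (sigma - mu) / (sigma + mu) of waiting times.

    B is -1 for perfectly regular events, near 0 for a Poisson process,
    and approaches 1 for very bursty activity.
    """
    mu = statistics.mean(gaps)
    sigma = statistics.pstdev(gaps)
    return (sigma - mu) / (sigma + mu)

rng = random.Random(7)  # fixed seed so the run is repeatable

regular_gaps = [1.0] * 5000                                   # clockwork events
poisson_gaps = [rng.expovariate(1.0) for _ in range(5000)]    # Poisson process
bursty_gaps = [rng.paretovariate(1.5) for _ in range(5000)]   # heavy-tailed stand-in

b_regular = burstiness(regular_gaps)
b_poisson = burstiness(poisson_gaps)
b_bursty = burstiness(bursty_gaps)

print(f"regular: {b_regular:+.2f}")
print(f"Poisson: {b_poisson:+.2f}")
print(f"bursty:  {b_bursty:+.2f}")
```

To apply this to something like the activation data you would need the actual timestamps, which the video does not give us, so treat this as a way to get a feel for the number rather than a measurement of anything.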

This raises the question of what other events follow the power law. As others (Paxson & Floyd 1996; Kleban & Clearwater 2003; Eubank et al. 2004) have shown, you can find it virtually everywhere. What I think we now need to question is not whether it is there (it is) but how and why it tends to crop up in the way it does. Physicists working with properties of gases have been inquiring into the why of the power law for a while now, and it has paid off with new understandings of the properties of matter in general and the application of quantum mechanics in Newtonian situations.

As with social network analysis and other currently trendy topics in the computational social sciences, we are doing a pretty alright job of describing phenomena, and an increasingly good job of visualizing and presenting them. Now we need to start delving into the basis of these emergent properties through cultural, structural, and psychological/psychosocial functions. I think that visualization will play a significant role in this development because we need to understand a pattern in order to break it down and qualify it. That is why Barabási’s charts here are so important; they show us a pattern at work that we can immediately generalize to other events (like Android activations) just using the pattern recognition powers of our brains.

A.-L. Barabási. The origin of bursts and heavy tails in human dynamics. Nature 435 207–211 (2005).

A. Vázquez, J. G. Oliveira, Z. Dezsö, K.-I. Goh, I. Kondor & A.-L. Barabási. Modeling bursts and heavy tails in human dynamics. Physical Review E 73, 036127 (2006).

Eubank, H. et al. Controlling epidemics in realistic urban social networks. Nature 429, 180–184 (2004).

Harder, U. & Paczuski, M. Correlated dynamics in human printing behavior. Preprint at http://xxx.lanl.gov/abs/cs.PF/0412027 (2004).

Kleban, S. D. & Clearwater, S. H. Hierarchical Dynamics, Interarrival Times and Performance. Proc. SC2003 http://www.sc-conference.org/sc2003/paperpdfs/pap222.pdf (2003).

Paxson, V. & Floyd, S. Wide-area traffic: The failure of Poisson modeling. IEEE/ACM Trans. Netw. 3, 226 (1996).

*Alright, maybe this is not so obvious. It’s just that I have been reading the fantastic sort of historical pseudo-biography of the early innovators of complexity science by Mitchell Waldrop (on Amazon.com here). Although it is a little dated these days, especially given the rapid pace of development in computational methods, I highly recommend it to any lay reader like me.
