The Pareto Principle

You see the pattern everywhere: the top 1 percent of the population controls 35 percent of the wealth. On Twitter, the top 2 percent of users send 60 percent of the messages. In the health care system, the treatment of the most expensive fifth of patients creates four-fifths of the overall cost.

This pattern was so common that the Italian economist Vilfredo Pareto, for whom it is named, called it a “predictable imbalance”. Despite this bit of century-old optimism, however, we are still failing to predict it, even though it is everywhere.

Part of our failure to expect the expected is that we have been taught that the paradigmatic distribution of large systems is the Gaussian distribution, commonly known as the bell curve. In a bell curve distribution, such as height, the average and the median (the middle point in the system) are the same. The average height of a hundred American women selected at random will be about 5’4″, and the height of the fiftieth-ranked woman will also be about 5’4″.

Pareto distributions are nothing like that. The recursive 80/20 weighting means that the average is far from the middle. This in turn means that in such systems, most people, or whatever is being measured, are below average, a pattern encapsulated in the old economics joke: “Bill Gates walks into a bar and makes everybody a millionaire, on average”.
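The gap between average and middle can be checked with a small simulation. The sketch below is an illustration rather than anything from the essay: it assumes heights are Gaussian with a mean of 64 inches and a standard deviation of 2.5 inches (the spread is an assumed figure), and it draws "wealth" from a Pareto distribution whose shape parameter, log 5 / log 4, is the value at which the top fifth holds four-fifths of the total. The names `summarize`, `heights` and `wealth` are hypothetical.

```python
import math
import random
import statistics


def summarize(values):
    """Return (mean, median, fraction of values strictly below the mean)."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    below = sum(v < mean for v in values) / len(values)
    return mean, median, below


rng = random.Random(42)  # fixed seed so repeated runs give the same sample
n = 100_000

# Assumed bell curve: heights in inches, mean 64 (5'4"), standard deviation 2.5.
heights = [rng.gauss(64.0, 2.5) for _ in range(n)]

# Pareto shape at which the top 20 percent holds 80 percent of the total.
ALPHA_80_20 = math.log(5) / math.log(4)
wealth = [rng.paretovariate(ALPHA_80_20) for _ in range(n)]

h_mean, h_median, h_below = summarize(heights)
w_mean, w_median, w_below = summarize(wealth)

print(f"Gaussian: mean={h_mean:.2f} median={h_median:.2f} below mean={h_below:.1%}")
print(f"Pareto:   mean={w_mean:.2f} median={w_median:.2f} below mean={w_below:.1%}")
```

In the Gaussian sample the mean and median nearly coincide and about half the values fall below the mean. In the Pareto sample the mean sits well above the median, so a clear majority of values are "below average".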

The Pareto distribution shows up in a remarkably wide array of complex systems. Together, “the” and “of” account for 10 percent of all words used in English. The most volatile day in the history of a stock market will typically be twice as volatile as the second most volatile day and ten times as volatile as the tenth. Tag frequency on Flickr photos obeys a Pareto distribution, as does the magnitude of earthquakes, the popularity of books, the size of asteroids, and the social connectedness of your friends. The Pareto principle is so basic to the sciences that special graph paper showing Pareto distributions as straight lines rather than as steep curves is manufactured by the ream.
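The ratios quoted for stock market volatility (twice the second, ten times the tenth) match a simple rank-size rule in which the item at rank k is the largest item divided by k. The sketch below assumes that rule with an exponent of exactly 1, which is a convenient idealization and not a claim about any real market. The function `rank_size` and the starting value of 100 are hypothetical. The sketch also shows why logarithmic axes turn the steep curve into a straight line: the slope between any two points on a log-log plot is constant.

```python
import math


def rank_size(largest, rank, exponent=1.0):
    """Size of the item at a given rank under an idealized rank-size rule."""
    return largest / rank ** exponent


# Hypothetical scale: the largest event has size 100.
sizes = [rank_size(100.0, rank) for rank in range(1, 11)]

ratio_first_to_second = sizes[0] / sizes[1]
ratio_first_to_tenth = sizes[0] / sizes[9]

# Slope of log(size) against log(rank) between rank 1 and rank 10.
slope = (math.log(sizes[9]) - math.log(sizes[0])) / (math.log(10) - math.log(1))

print(ratio_first_to_second, ratio_first_to_tenth, round(slope, 6))
```

A different exponent would change the ratios and the steepness of the line, but the plot on log-log axes would still be straight.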

And yet, despite a century of scientific familiarity, samples drawn from Pareto distributions are routinely presented to the public as anomalies, which prevents us from thinking clearly about the world. We should stop thinking that average family income and the income of the median family have anything to do with one another, or that enthusiastic and normal users of communications tools are doing similar things, or that extroverts should be only moderately more connected than normal people. We should stop thinking that the largest future earthquake or market panic will be as large as the largest historical one. The longer the system exists, the likelier it is that an event twice as large as all previous ones is coming.

This doesn’t mean that such distributions are beyond our ability to affect them. A Pareto curve’s decline from head to tail can be more or less dramatic, and in some cases, political or social intervention can affect that slope. Tax policy can raise or lower the share of income of the top 1 percent of a population, just as there are ways to constrain the overall volatility of markets, or reduce the band in which health-care costs can fluctuate.
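To see how the slope governs the share held at the top, one can use the standard result for an idealized Pareto distribution with shape parameter alpha greater than 1: the top fraction p of the population holds p raised to the power (1 - 1/alpha) of the total. The sketch below applies that formula. The function name `top_share` and the example shape values are illustrative assumptions, not figures from the essay.

```python
import math


def top_share(p, alpha):
    """Share of the total held by the top fraction p of an idealized
    Pareto distribution with shape alpha (requires alpha > 1)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the total to be finite")
    return p ** (1 - 1 / alpha)


# Shape at which the top 20 percent holds 80 percent of the total.
ALPHA_80_20 = math.log(5) / math.log(4)

# Hypothetical shapes, from a steep head (small alpha) to a gentler one.
for alpha in (ALPHA_80_20, 1.5, 2.0, 3.0):
    print(f"alpha={alpha:.3f}  top 1% share={top_share(0.01, alpha):.1%}")
```

A larger alpha means a gentler decline from head to tail and a smaller share at the top, yet the distribution remains a Pareto distribution at every value.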

However, until we assume such systems are Pareto distributions and will remain so even after any such intervention, we haven’t even started thinking about them in the right way. In all likelihood, we’re trying to put a Pareto peg in a Gaussian hole. A hundred years after the discovery of this predictable imbalance, we should finish the job and actually start expecting it.

Clay Shirky
from This Will Make You Smarter
edited by John Brockman