# Gaussianitis: a subtle (and nearly universal) disease

April 5, 2008

Gaussianitis: a disorder characterised by a subject’s compulsive use of ‘Normal’ statistics in order to get away from the complexity and ambiguity of life.

How does Gaussianitis work? Let me give you a couple of examples.

The interview with Nick Clegg (the LibDem leader) in GQ Magazine has stimulated a flurry of articles on the number of sexual partners people have. Is 30 normal for a 40-year-old man? Should I worry if my Casanova index is stuck at 5? Is my Don Giovanni parameter abnormal if I am at 100? Well, what does it mean to be normal in sexual life anyway? Now, this is an interesting question!

How do we assess normality in sexual behaviour? Well, one way is to collect a sample of the population, ask them how many partners they have had, add up the answers and divide by the total number of respondents. We get the mean and then polish it up by getting rid of the outliers. Then we compare it with our own number and decide where we stand. Majority rules. Right? Wrong. It is a symptom of Gaussianitis at work, and, I’m sorry, it affects nearly everyone.

What’s wrong? Well, the error is in the assumption again: that the majority of individuals will be similar to each other. In a famous paper published in Nature in 2001, the Swedish sociologist Liljeros and his colleagues reported that the number of sexual contacts follows a power law, that is, it shows a long and fat tail, with most individuals having only a few contacts and a few of them having lots. When the tail is fat enough, a power law distribution shows no meaningful mean or variance, because the extreme events in the tail (the Casanovas, Don Giovannis or Patient Zero Gaëtan Dugas) keep shifting them. And as those extreme events are much more common than a Gaussian would lead one to expect, the sample mean and variance never converge however many people you survey. Or, more simply, mean and variance don’t exist. What does it all mean? Simply that the variability of the phenomenon is so large that there is no convergence toward any representative value. The representative value doesn’t exist. As my friend Jack Cohen points out: “it’s like taking an average between a man and woman. What do you get? A person with one breast, one testicle, one ovary, half a penis”. What is that representative of?
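The non-convergence described above is easy to see in a small simulation. What follows is only a sketch on synthetic numbers, not Liljeros’s data: the Gaussian parameters (mean 10, standard deviation 2), the Pareto shape of 0.9 and the seed are all arbitrary assumptions, chosen so that the fat-tailed case has an infinite theoretical mean. It tracks the sample mean as the sample grows, for a thin-tailed and a fat-tailed population.

```python
import random
import statistics


def running_means(samples, checkpoints):
    """Return {n: mean of the first n samples} for each n in checkpoints."""
    wanted = set(checkpoints)
    means = {}
    total = 0.0
    for i, x in enumerate(samples, start=1):
        total += x
        if i in wanted:
            means[i] = total / i
    return means


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 100_000
checkpoints = [100, 1_000, 10_000, 100_000]

# Thin-tailed case: Gaussian with mean 10 and standard deviation 2 (arbitrary).
gaussian = [rng.gauss(10.0, 2.0) for _ in range(n)]

# Fat-tailed case: Pareto with shape 0.9 (arbitrary). With a shape below 1
# the theoretical mean is infinite, so the sample mean has nothing to settle on.
heavy = [rng.paretovariate(0.9) for _ in range(n)]

gauss_means = running_means(gaussian, checkpoints)
heavy_means = running_means(heavy, checkpoints)

for k in checkpoints:
    print(f"n={k:>7}  gaussian mean={gauss_means[k]:8.3f}  "
          f"power-law mean={heavy_means[k]:12.3f}")

# How much of the fat-tailed total comes from the single largest draw?
top_share = max(heavy) / sum(heavy)
print(f"largest single draw carries {top_share:.1%} of the power-law total")
print(f"power-law median: {statistics.median(heavy):.3f}")
```

The Gaussian column should hover close to 10 at every checkpoint, while the power-law column tends to jump whenever a new Casanova enters the sample. The median, by contrast, stays small, which is one reason it is often a more honest summary of a fat-tailed sample than the mean.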

Let me give you another example. Robert Axtell is an economist at George Mason University’s Center for Social Complexity. He gave a fascinating talk this year at the Organization Science Winter Conference. He asked the ‘simple’ question: what is a representative US firm in terms of employees? How do you find out? Well, you get a good database, you add up all the employees of US firms and then divide by the number of firms. You get something like, say, 20 (I don’t exactly remember the number). Now this number is important if you are in charge of setting the regulatory frameworks for businesses and you want to make life better for the majority of firms. You assume that 20 is the best number to start from. Well, it turns out that the distribution of firm sizes in the US (and in most countries all over the world) decreases in a power law fashion. No meaningful mean, no meaningful variance, no representative firm. The most common firm, the mode in fact, has zero employees. Yes, zero: only the owner-manager, no employees. So where do you start when setting your regulatory frameworks for, say, SMEs?
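To get a feel for how far apart the ‘average’ firm and the typical firm can be, here is a rough sketch on made-up data. Nothing in it comes from Axtell’s database: the firm sizes are synthetic, drawn from a Pareto distribution with an arbitrary shape of 1.1 and rounded down so that the smallest draws become firms with zero employees.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable
n_firms = 100_000

# Hypothetical firm sizes: floor of a Pareto draw (shape 1.1, arbitrary)
# minus one, so that any draw in [1, 2) becomes a firm with zero employees.
employees = [int(rng.paretovariate(1.1)) - 1 for _ in range(n_firms)]

mean_size = sum(employees) / n_firms
median_size = statistics.median(employees)
mode_size = statistics.mode(employees)

# Fraction of firms that are smaller than the "average" firm.
share_below_mean = sum(e < mean_size for e in employees) / n_firms

print(f"mean employees:   {mean_size:.2f}")
print(f"median employees: {median_size}")
print(f"mode employees:   {mode_size}")
print(f"largest firm:     {max(employees)}")
print(f"firms smaller than the mean: {share_below_mean:.1%}")
```

In this toy population the mode and the median both sit at zero employees, while the mean is pulled upward by a handful of giants, so a rule designed around the mean would fit only a small minority of firms.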

Jack Cohen, again, says that we humans (but it extends to firms and planets) are like hand-made pieces, made by an artisan, not standardised, mass-manufactured items. If this is true (and if you talk to him or read his books, he can show you some quite convincing evidence), then the idea of simplifying the true complexity of life by assuming representative averages when none exist is a dangerous illusion. So next time you revert to an average to make a point, ask yourself if it is Gaussianitis!

