The ‘average’ revolutionized scientific research, but overreliance on it has led to discrimination and harm

When analyzing a set of data, one of the first steps many people take is to calculate an average. You can compare your height to the average height of the people where you live, or brag about the batting average of your favorite baseball player. But while the average can help you study a data set, it has important limitations.

Using the average without regard for these limitations has led to serious problems, including discrimination, injury and even life-threatening accidents.

For example, the US Air Force once designed its aircraft for ‘the average man’, but abandoned the practice when pilots could not control their planes. The average has many uses, but it says nothing about the variability in a data set.

I am a discipline-based education researcher, which means that I study how people learn, with a focus on technology. My research includes investigating how engineers use averages in their work.

Using the mean to summarize data

The mean has been around for a long time; its use is documented as early as the ninth or eighth century BC. In one early example, the Greek poet Homer estimated the number of soldiers on ships by taking an average.

Early astronomers wanted to predict future locations of stars. But to make these predictions, they first needed accurate measurements of the stars’ current positions. Several astronomers carried out position measurements independently of each other, but they often arrived at different values. Because a star has only one true position, these discrepancies posed a problem.

In 1632, Galileo became the first to push for a systematic approach to these measurement differences. His analysis was the beginning of error theory, which helps scientists reduce uncertainty in their measurements.

Error theory and the mean

According to error theory, researchers interpret a series of measurements as falling around a true value that is corrupted by errors. In astronomy, a star has a real location, but early astronomers may have had unsteady hands, blurry telescope views and bad weather – all sources of error.

To deal with errors, researchers often assume that measurements are unbiased. In statistics, this means that they are evenly distributed around a central value. Unbiased measurements still contain errors, but they can be combined to better estimate the true value.

Suppose three scientists each made three measurements. Viewed individually, their measurements may seem random, but when unbiased measurements are put together, they distribute evenly around a middle value: the mean.
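To see why combining unbiased measurements helps, here is a minimal simulation sketch. The true position, the size of the noise and the number of repeated trials are all hypothetical values chosen for illustration; only the "three scientists, three measurements each" setup comes from the example above.

```python
import random

TRUE_POSITION = 100.0  # hypothetical true value, arbitrary units
NOISE_SD = 1.0         # assumed size of the random measurement error
N_MEASUREMENTS = 9     # three scientists, three measurements each
N_TRIALS = 2000        # repeat the experiment many times to see the typical outcome

rng = random.Random(42)

single_errors = []
average_errors = []
for _ in range(N_TRIALS):
    # Unbiased noise is centered on zero, so errors fall on both sides of the truth.
    readings = [TRUE_POSITION + rng.gauss(0, NOISE_SD) for _ in range(N_MEASUREMENTS)]
    average = sum(readings) / len(readings)
    single_errors.append(abs(readings[0] - TRUE_POSITION))
    average_errors.append(abs(average - TRUE_POSITION))

typical_single_error = sum(single_errors) / N_TRIALS
typical_average_error = sum(average_errors) / N_TRIALS

print(f"Typical error of one measurement:   {typical_single_error:.3f}")
print(f"Typical error of the nine-average:  {typical_average_error:.3f}")
```

Under these assumptions, the average of nine readings typically lands about three times closer to the true value than any single reading does. If the noise were biased, for example if every telescope pointed slightly too high, averaging would not remove that shared offset.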

When the measurements are unbiased, the average will tend to sit in the middle of all the measurements. We can even show mathematically that the average is the value closest to all of the measurements, in the sense that it minimizes the total squared distance to them. For this reason, the average is an excellent tool for dealing with measurement error.
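One common way to make "closest" precise is the total squared distance from a candidate value to every measurement. The sketch below checks numerically that no nearby candidate beats the average on that score. The readings are made-up numbers, and squared distance is one choice of closeness among several, so treat this as an illustration and not a proof.

```python
from statistics import mean


def sum_squared_distance(candidate, measurements):
    """Total squared distance from a candidate value to every measurement."""
    return sum((m - candidate) ** 2 for m in measurements)


# Hypothetical star-position readings (arbitrary units) from three observers.
measurements = [10.2, 9.7, 10.4, 9.9, 10.1, 9.8, 10.3, 10.0, 9.6]

average = mean(measurements)
best_score = sum_squared_distance(average, measurements)

# Scan candidates in steps of 0.01 on either side of the average.
candidates = [average + step / 100 for step in range(-50, 51)]
scores = [sum_squared_distance(c, measurements) for c in candidates]
best_candidate = candidates[scores.index(min(scores))]

print(f"Average of the readings: {average:.3f}")
print(f"Best candidate found:    {best_candidate:.3f}")
```

If closeness were measured by total absolute distance instead, the median would be the minimizer, not the mean.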

Statistical thinking

Error theory was considered revolutionary at the time. Other scientists admired the precision of astronomy and tried to bring the same approach to their fields. The 19th-century scientist Adolphe Quetelet applied ideas from error theory to study humans and introduced the idea of taking averages of human heights and weights.

The average helps make comparisons between groups. For example, taking averages from a data set of male and female heights can show that the men in the data set are, on average, taller than the women. However, the average does not tell us everything. In the same data set we could probably find individual women who are taller than individual men.

So you can’t just look at the average. You also need to consider the distribution of values by thinking statistically. Statistical thinking means carefully considering variation – the tendency of measured values to differ.

Different astronomers measuring the same star and recording different positions is one example of variation. The astronomers had to think carefully about where their variation came from. Since a star has one true position, they could safely assume that their variation was due to error.

Taking the average of measurements makes sense when variation arises from sources of error. But researchers must be careful in interpreting the average when the variation is real. In the height example, individual women may be taller than individual men, even though men are taller on average. Focusing only on the average overlooks the variation, which has led to serious problems.
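The height example can be sketched with simulated data. The group means, the spread and the sample sizes below are assumed values picked for illustration, not figures from any real data set.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical heights in centimeters; means and spread are assumptions.
men = [rng.gauss(175, 7) for _ in range(500)]
women = [rng.gauss(162, 7) for _ in range(500)]

mean_gap = mean(men) - mean(women)

# Compare every woman with every man and count how often she is taller.
pairs_woman_taller = sum(1 for w in women for m in men if w > m)
share_woman_taller = pairs_woman_taller / (len(women) * len(men))

print(f"Men are taller on average by {mean_gap:.1f} cm")
print(f"Share of woman-man pairs where the woman is taller: {share_woman_taller:.1%}")
```

With these assumed numbers the averages differ clearly, yet in expectation roughly one randomly chosen woman-man pair in ten has the woman taller. The comparison of averages is true of the groups, not of every individual in them.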

Quetelet did not adopt only the practice of calculating averages from error theory. He also adopted its assumption of a single true value. He elevated “the average man” to an ideal and suggested that human variability was fundamentally error – that is, a departure from the ideal. According to Quetelet, there is something wrong with you if you are not exactly average height.

Researchers who study social norms note that Quetelet’s ideas about “the average man” contributed to the modern meaning of the word “normal”: normal height and normal behavior.

Some people, including early statisticians, have used these ideas to divide populations in two: people who are superior in some way and people who are inferior.

For example, the eugenics movement – a despicable attempt to prevent ‘inferior’ people from having children – traces its thinking to these ideas about ‘normal’ people.

Quetelet’s idea of variation as error has supported discriminatory practices, and his way of using the average also has direct links to modern engineering failures.

Failures of the mean

In the 1950s, the US Air Force designed its aircraft for ‘the average man’. It assumed that an aircraft designed for average height, average arm length and the average of several other key dimensions would work for most pilots.

This decision contributed to as many as 17 pilots crashing in a single day. Although ‘the average man’ could fly the plane perfectly, real variation stood in the way. A shorter pilot would have difficulty seeing, while a pilot with longer arms and legs would have to squeeze in to fit.

Although the Air Force assumed that most pilots would be near average on all important dimensions, it found that out of 4,063 pilots, not one was average on all of them.
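A rough simulation suggests why this result is less surprising than it sounds. Only the figure of 4,063 pilots comes from the account above. The number of dimensions (10), the definition of "near average" as the middle 30% of each dimension, and the treatment of dimensions as independent are all assumptions made for this sketch. Real body measurements are correlated, so the true numbers would differ.

```python
import random
from statistics import NormalDist

N_PILOTS = 4063          # from the article
N_DIMENSIONS = 10        # assumption for illustration
MIDDLE_FRACTION = 0.30   # assumption: "near average" means the middle 30%

# Half-width, in standard deviations, of the band holding the middle 30%.
half_width = NormalDist().inv_cdf(0.5 + MIDDLE_FRACTION / 2)

rng = random.Random(0)
# Each simulated pilot is a list of standardized, independent body dimensions.
pilots = [[rng.gauss(0, 1) for _ in range(N_DIMENSIONS)] for _ in range(N_PILOTS)]


def count_average(pilots, n_dims):
    """Count pilots who are near average on each of the first n_dims dimensions."""
    return sum(all(abs(z) < half_width for z in p[:n_dims]) for p in pilots)


counts = [count_average(pilots, k) for k in range(1, N_DIMENSIONS + 1)]
expected_all = N_PILOTS * MIDDLE_FRACTION ** N_DIMENSIONS

for k, c in enumerate(counts, start=1):
    print(f"Near average on the first {k:2d} dimensions: {c} pilots")
print(f"Expected number near average on all {N_DIMENSIONS}: {expected_all:.3f}")
```

Each extra dimension keeps only about 30% of the pilots who were still "average" so far, which means the count shrinks quickly. Under these assumptions, the expected number of pilots who are near average on all ten dimensions is well below one.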

The Air Force solved the problem by designing for variation: it introduced adjustable seats to account for real-world differences among pilots.

While adjustable seats may seem obvious now, this “average person” thinking still causes problems today. In the US, women are about 50% more likely than men to be seriously injured in car accidents.

The Government Accountability Office blames this disparity on crash testing practices, which crudely represent female passengers using a scaled-down version of a male dummy, much like the Air Force’s “average man.” The first female crash test dummy was introduced in 2022 and has yet to be adopted in the US.

The average is useful, but has limitations. For estimating true values or making comparisons between groups, the average is powerful. However, for individuals who exhibit real variability, the average simply doesn’t mean much.

This article is republished from The Conversation, an independent nonprofit organization providing facts and trusted analysis to help you understand our complex world. It was written by: Zachary del Rosario, Olin College of Engineering


Zachary del Rosario receives funding from the National Science Foundation and has collaborated with Citrine Informatics and Toyota Research Institute.
