# Those statistics are lying to you (maybe)

In *How to Lie with Statistics*, first published in 1954, Darrell Huff explained how statistics are used to intentionally deceive the inattentive consumer of information. This wasn’t a new idea; several of Huff’s predecessors had arrived at the same conclusion:

“There are three kinds of lies: lies, damned lies, and statistics.”

– attributed to British Prime Minister Benjamin Disraeli (1804-1881)

“Facts are stubborn things, but statistics are pliable.”

– Mark Twain (1835-1910)

“[Some individuals] use statistics as a drunk man uses lamp-posts, for support rather than for illumination.”

– Andrew Lang (1844-1912), Scottish writer and anthropologist

Some were even more skeptical of the use of statistics (sorry, statisticians):

“If your experiment needs a statistician, you need a better experiment.”

– Ernest Rutherford (1871-1937), best known for establishing the planetary model of the atom

## With that in mind, let’s look at some of the lessons from Huff’s book by picking out forms of misleading statistics:

### Biased samples (also called ascertainment bias or systematic bias)

A sample can be biased in various ways, intentionally or not, but all biased samples “systematically favor some outcomes over others.”
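To make the idea concrete, here is a minimal simulation sketch. Everything in it is made up for illustration: the population of satisfaction scores, the sample sizes, and the assumption that only customers scoring above 50 bother to return the survey.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 customers with satisfaction scores around 50.
population = [random.gauss(50, 15) for _ in range(1000)]

# Fair sample: every customer is equally likely to be picked.
fair_sample = random.sample(population, 200)

# Biased sample: assume only customers scoring above 50 return the survey.
responders = [score for score in population if score > 50]
biased_sample = random.sample(responders, 200)

population_mean = mean(population)
fair_mean = mean(fair_sample)
biased_mean = mean(biased_sample)

print(f"population mean: {population_mean:.1f}")
print(f"fair sample mean: {fair_mean:.1f}")
print(f"biased sample mean: {biased_mean:.1f}")
```

The fair sample should land close to the population mean, while the biased one overstates satisfaction, no matter how many responses it collects.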

### Biased averages

Huff describes three kinds of average: mean, median, and mode. In a normally distributed (think “bell curve”) sample, the three are similar. In a skewed sample, such as incomes with a few very high earners, they can differ substantially, so the choice of which “average” to report can change the story.
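A quick sketch of how far the three averages can drift apart. The salary figures are invented for illustration (thousands per year at a hypothetical small firm, with one very well-paid owner):

```python
from statistics import mean, median, mode

# Made-up salaries in thousands; the 400 is the owner.
salaries = [30, 30, 30, 35, 40, 45, 50, 60, 400]

average_mean = mean(salaries)      # pulled upward by the single outlier
average_median = median(salaries)  # the middle value
average_mode = mode(salaries)      # the most common value

print(f"mean:   {average_mean}")    # 80
print(f"median: {average_median}")  # 40
print(f"mode:   {average_mode}")    # 30
```

All three are legitimately “the average salary,” yet the mean is more than double the mode. Asking which average was used is a cheap first check.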

### Misleading visualizations

By skewing elements of graphs and charts, like the scale, you can paint a picture very different from the underlying truth. Look at these graphs below, and you’ll spot what’s wrong.
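One common scale trick is starting the vertical axis somewhere other than zero. The sketch below does not draw anything; it only computes how tall two bars would appear relative to each other. The values (50 and 52) and the helper name `bar_height_ratio` are hypothetical.

```python
def bar_height_ratio(first, second, baseline):
    """Ratio of drawn bar heights when the axis starts at `baseline`."""
    return (second - baseline) / (first - baseline)

last_year, this_year = 50.0, 52.0  # made-up values: a 4% increase

# Axis starts at zero: bars are drawn in proportion to the data.
honest = bar_height_ratio(last_year, this_year, baseline=0.0)

# Axis starts at 49: the same data, but the second bar looks far taller.
truncated = bar_height_ratio(last_year, this_year, baseline=49.0)

print(f"axis from 0:  second bar is {honest:.2f}x the first")
print(f"axis from 49: second bar is {truncated:.2f}x the first")
```

With the axis starting at 49, a 4% change is drawn as a bar three times as tall.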

### “Significant findings” and throwing out data

Statistical significance is determined by hypothesis testing and numeric calculation. However, “insignificant findings” are often conflated (perhaps intentionally) with “didn’t support my stance,” and then quietly thrown out.
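As one illustration of what “hypothesis testing and numeric calculation” can look like, here is a rough sketch of a permutation test. It is just one of many possible tests and is not taken from Huff’s book. The function name, the data, and the number of shuffles are all assumptions made for the example.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Estimate how often random relabeling produces a gap in means
    at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        gap = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if gap >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_shuffles + 1)

# Made-up measurements for illustration.
control = [1, 2, 3, 4, 5]
big_effect = [101, 102, 103, 104, 105]
no_effect = [1, 2, 3, 4, 5]

p_big = permutation_p_value(control, big_effect)
p_none = permutation_p_value(control, no_effect)
print(f"clear difference: p = {p_big:.3f}")
print(f"no difference:    p = {p_none:.3f}")
```

The p-value comes out of the data and the test. Whether the result agrees with anyone’s stance never enters the calculation.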

### Semi-attached figures

In the words of Huff: “If you can’t prove what you want to prove, demonstrate something else and pretend that they are the same thing.”

### Correlation vs. Causation

Just because two things are correlated doesn’t necessarily mean that one causes the other. Look __here__ for some fun examples.
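The sketch below simulates a textbook case: a hidden third factor (temperature) drives two otherwise unrelated quantities. All the numbers, coefficients, and variable names are invented for the simulation. No real data is involved.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hidden common cause: daily temperature (made-up range).
temperature = [rng.uniform(10, 35) for _ in range(500)]

# Both quantities depend on temperature plus independent noise.
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r_raw = pearson(ice_cream, drownings)

# Subtract the temperature effect (known here only because we simulated it).
ice_cream_rest = [s - 2.0 * t for s, t in zip(ice_cream, temperature)]
drownings_rest = [d - 0.5 * t for d, t in zip(drownings, temperature)]
r_adjusted = pearson(ice_cream_rest, drownings_rest)

print(f"raw correlation: {r_raw:.2f}")
print(f"after removing temperature: {r_adjusted:.2f}")
```

The raw correlation is strong even though neither quantity influences the other in the simulation. Once the shared cause is removed, it should drop to roughly zero.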

## So, how can you avoid falling victim to the statistical charlatans? Start with these questions:

- Who is making the claim?
- How do they know? Where is the evidence?
- Are we given the whole picture?
- Has someone changed the topic? Are we still discussing the original question?
- Does the argument make sense?

Now you, too, can fight the good fight against misleading statistics!