LIES, DAMNED LIES AND DATA
America has the lowest COVID-19 death rate in the world. Just ask Trump. He’s got the numbers to prove it. It also has one of the highest rates. Just ask anyone else. They have the numbers to prove that too. It’s all true. It’s all lies. It’s all in the data.
But that’s just America, right? Or is it? Statistics New Zealand just published June quarter employment figures that show unemployment is dropping here. They took pains to point out that the headline number didn’t really reflect the actual situation. Then they published it anyway. Because that’s what the numbers said.
Data is a lot like a library.
There’s stacks of great stuff in the library. You can find a book on just about anything. You can also find great stories, interesting insights and a whole bunch of other stuff you’ll never want to look at. There are fact-based journals of science and history – and a stack of other facts that debunk them. Your data is just as useful and contradictory as any local library.
Data can tell you what you want to hear.
The great thing about data is that people believe it. In fact, 72% of people will instantly believe a made-up number. That’s not to say that data itself is made up. It rarely is. It’s just that data by itself is little more than dots on a page.
It’s not until you join those dots to draw a picture that data becomes useful. But only humans (or robots trained by humans) can usefully extract insight through the dot-joining process. And humans are inclined toward answers they want to hear.
The confounding variable of time.
The other big challenge of data is shelf life. A database is only ever a record of yesterday’s dots. Even as we scrape information from the internet in real time – it’s still only digital exhaust. The great thing about people is that we’re creatures of habit. So yesterday’s behaviour is a pretty good predictor of tomorrow. But the confounding variable is always time – what happened between then and now?
An obvious example is those headlines from Statistics New Zealand. If the June quarter employment numbers are empirically measured just like the March numbers, then they have to be accurate, right? Wrong. Something happened in the middle that tipped the table over and rendered them rubbish.
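For anyone who likes to see the arithmetic, here is a rough sketch of one way a headline rate can fall while things get worse. It assumes the usual definition of the unemployment rate (unemployed people as a share of the labour force, which is employed plus unemployed). The figures are invented for illustration and are not Statistics New Zealand data, and the function name is just a label for this example.

```python
def unemployment_rate(unemployed, employed):
    """Headline rate: unemployed people as a share of the labour force."""
    labour_force = unemployed + employed
    return unemployed / labour_force


# Hypothetical figures in thousands of people (NOT real Stats NZ numbers).
march = {"employed": 2650, "unemployed": 115}

# Assumption for this sketch: between quarters, jobs are lost, but many
# jobless people stop actively looking for work, so they are no longer
# counted as unemployed and drop out of the labour force altogether.
june = {"employed": 2620, "unemployed": 110}

march_rate = unemployment_rate(march["unemployed"], march["employed"])
june_rate = unemployment_rate(june["unemployed"], june["employed"])

print(f"March: {march_rate:.2%} unemployed, {march['employed']}k employed")
print(f"June:  {june_rate:.2%} unemployed, {june['employed']}k employed")
```

In this made-up case the headline rate goes down even though fewer people have jobs. The measurement is the same in both quarters. What changed in between is who gets counted.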
Data is gold. But we need to craft it.
As more and more people look to data to drive decision making, it’s more and more important to remember the humans behind the dots.
Transactional data can tell us what people have done. Demographic data can tell us what they look like. And various digital toys can scrape activities that usefully describe how they did it. But only a human (or a human-built robot) can get us even close to any answers on ‘why’.
DON’T BELIEVE THE DATA IF IT DOESN’T MAKE SENSE.
So next time you’re talking about data, don’t believe the headlines if they don’t add up. Dig below the surface to find the story in the data – and always be alert for confounding variables. Or, to put it in the words of a guy who knew a lot about data analysis before it was even invented:
“Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?”
– T.S. Eliot
He was a poet, a playwright, and a fan of practical cats – and Eliot understood the importance of the human to drive insight. Data can be amazing to help us find answers. But the data is never the answer in itself.
That’s what I reckon. What do you think?