If David Brooks is correct, the “rising philosophy of the day” is “data-ism.” But you don’t have to believe David Brooks. Just look at the big data (e.g. Google Trends) on “big data.”

For the political junkies, data became sexy in 2012. First, the New York Times’ Nate Silver’s meta-analyses of polling data triumphed over the pundits’ “gut feelings.” Second, the Obama campaign successfully used data analytics to increase voter turnout. This caused people to pay attention (witness, for example, David Brooks’ new devotion to the subject as prime column-fodder).

Of course, for those of us in the transparency and accountability advocacy community, data has long been a prized commodity. And as governments around the world increasingly commit to open data promises, more and more data is becoming available.

At its best, data allows us to transcend our personal anecdotal experiences, giving us the big picture. It allows us to detect relationships and patterns that we wouldn’t otherwise see. Using data smartly can help us to make better decisions about both our own lives and our society.

But it’s important to understand that data and data analysis are merely tools. They can be used well, or they can be used poorly. It is remarkably easy both to mislead and to be misled by data. Hence the old adage: “There are three kinds of lies: lies, damned lies, and statistics.”

For many people, data can quickly overwhelm and confuse. It’s easy to misinterpret data, or to use it irresponsibly. We as humans are not particularly good at intuitively grasping large numbers, and our educational system generally does a poor job of helping us to counter this problem.

For that reason, I want to offer two basic principles that I think could prevent a majority of the data mistakes that I observe:
- Cherry-picking works better with fruit than data
- Correlation provokes questions better than it answers them