Business intelligence visualization guru Stephen Few wrote an interesting analysis of a Malcolm Gladwell talk he attended during a SAS Institute conference. Key idea:
Our former problems were usually solved by digging up and revealing the right information. He used Watergate as an example, pointing out that key information was hidden, and the problem was solved when Washington Post journalists Woodward and Bernstein were finally able to uncover this information that had been concealed. Modern problems, on the other hand, are not the result of missing or hidden information, Gladwell argued, but the result, in a sense, of too much information and the complicated challenge of understanding it. Enron was his primary example. The information about Enron’s practices was not kept secret. In fact, it was published in several years’ worth of financial reports to the SEC, totaling millions of pages. The facts that led to Enron’s rapid implosion were there for anyone who was interested to see, freely available on the Internet, but weren’t understood until a journalist spent two months reading and struggling to make sense of Enron’s earnings, which led him to discover that they existed only as contracts to buy energy at a particular price in the future, not as actual cash in the bank. The problems that we face today, both big ones in society like the current health care debate and smaller ones like strategic business decisions, do not exist because we lack information, but because we don’t understand it. They can be solved only by developing skills and tools to make sense of information that is often complex. In other words, the major obstacle to solving modern problems isn’t the lack of information, solved by acquiring it, but the lack of understanding, solved by analytics.
For Austin’s public sector, there are three related problems.
First, there is no notion that data is a public good that Austin’s citizens are entitled to have. Access to data is presently a discretionary privilege, granted only when a request clears freedom of information requirements and/or bureaucratic obstacles. This level of protection makes sense when it comes to individualized records, but it does not make sense for the types of data sets that are useful for most policy discussions (think water demand in the WTP4 debate).
Second, because there isn’t an affirmative expectation that data sets should be gathered and put out to the public, many opportunities to assemble useful data sets are not explored. What factors predict a good cop or a water main bursting? Data on these can be gathered and put out to the public in a way that does not compromise individual privacy and that allows for crowd-sourced crunching and accountability. There is no such expectation, and thus there are no systems in place to create these data products. They are just as essential as the budget (which, if you think about it, is also a data product) and need to start being considered basic elements of local government transparency, just as an annual, line-by-line budget is a basic expectation.
Third, our overall innumeracy as a society gets in the way of understanding the nuances revealed by data analysis. I am thinking, for example, of the controversy about Austin Energy’s renewable push, and how the debate lacks any concept of volatility or any breakdown of the rate of annualized increases.
The first two problems are easier to solve than the third. But perhaps if there is more data and associated analysis out there that affects daily lives, the median voter will start getting more interested in things like standard deviations. Just think about the number of statistical terms used in sports, and you get a sense that non-quants can get into numbers if they see the link to insights they care about.
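To make the terms above concrete, here is a minimal Python sketch of what "annualized increase" and "volatility" mean for a series of yearly rates. The numbers are invented purely for illustration and are not Austin Energy data. The function names are my own, and treating volatility as the standard deviation of year-over-year changes is one common convention, not the only one.

```python
from statistics import stdev

# Hypothetical average rates in cents per kWh, one value per year.
# Invented for illustration only; NOT actual Austin Energy figures.
rates = [9.0, 9.4, 9.1, 10.2, 10.0, 10.9]


def annualized_increase(series):
    """Compound annual growth rate between the first and last value."""
    years = len(series) - 1
    return (series[-1] / series[0]) ** (1 / years) - 1


def yearly_changes(series):
    """Year-over-year fractional change for each consecutive pair."""
    return [(b - a) / a for a, b in zip(series, series[1:])]


def volatility(series):
    """Sample standard deviation of the year-over-year changes."""
    return stdev(yearly_changes(series))


print(f"annualized increase: {annualized_increase(rates):.1%}")
print(f"volatility: {volatility(rates):.1%}")
```

Two rate paths can end at the same place, and so share the same annualized increase, while one gets there smoothly and the other bounces around. The second number is what captures that difference.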
Thanks for posting this. I’m amazed at the rate at which barriers to accessing information are coming down. We’re trying to do our part here at CAPCOG with projects like GeoMap (free aerial imagery), Data Points, etc., but there is a lot more work ahead of us. I think the public sector, generally, could improve on this, but when you watch the U.S. Census Bureau fight for its budget year after year, for example, those of us who believe strongly in the production of public use data can get discouraged. As expectations about data availability grow, hopefully we’ll see more political support for investing in data programs.
Brian,
I think you are correct that those of us who care about data need to do a better job of organizing to get resources allocated to creating and releasing data sets. I’d love to hear about your experiences at CAPCOG with Data Points: what type of data people want, what has worked to spur engagement, and what those of us in civil society can do to support more raw data being gathered and put out to the public. And btw, do you have a sense of who the local champions on this topic are?
There is a small (and growing) cadre of folks in the public and non-profit sectors in the Austin area who are passionate about data availability and why it’s important for public engagement and decision-making. I’m not sure they’d appreciate me calling them out here on your blog, but they are not hard to find if you look around for people publishing data sets, reports, etc. on popular issues. The most frequently requested products here include GIS data layers, aerial imagery, economic forecast data, and labor market information. I’ve seen a tremendous increase in the frequency of requests, as well as in the types of organizations making those requests, over my four years here at CAPCOG. If there’s interest out there among your readers, I’d be glad to talk to you about putting together a webinar or some other small forum to share what we’ve learned here.