XML is Not Enough


David Robinson, associate director of Princeton’s Center for Information Technology Policy, has an interesting post at the center’s Freedom to Tinker blog about how government should best present its data. David proposes that government release its information in a form nobody wants to read: XML files that are “machine-readable” but largely indecipherable to the human eye. It would then be up to journalists, activist organizations and individuals to decipher the data and present it in ways citizens can understand. This would unleash the creativity that would let “a thousand mashups bloom,” he argues.

In many cases, government releasing its data in XML format would be a step in the right direction. No question about that. One of the great maxims of Web 2.0 is that when it comes to data and information, content is king. Simply making data available opens up all sorts of possibilities for sharing, remixing and the like. But why should government stop there? Why shouldn’t government agencies also make an effort to present their data in ways the average citizen can easily understand? David is setting up a false choice, I believe. Who is advocating that government should be the “only source for interaction” with its data? Why either/or? Why not both?

There have been some very clever displays of government data. Take, for example, ProPublica’s interactive graph, published earlier this week, of where all the money is going in the proposed stimulus bill. Another is Sunlight’s own Capitol Words, which, for every day Congress is in session, visualizes the most frequently used words in the Congressional Record, giving you an at-a-glance view of which issues lawmakers address on a daily, weekly, monthly and yearly basis.
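To give a rough sense of the kind of processing behind a display like that, here is a minimal sketch in Python of counting the most frequent words in a block of text. It is not Capitol Words’ actual code; the `top_words` function, the short `STOPWORDS` list and the sample text are all invented here for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list for illustration; a real tool would use a far longer one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is",
             "for", "on", "that", "this", "we", "our"}

def top_words(text, n=3):
    """Return the n most frequent non-stopword words in text as (word, count) pairs."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)

# Invented sample text standing in for one day's record.
sample = (
    "The stimulus bill funds energy projects. "
    "Energy spending in the stimulus is the largest item. "
    "We debate the stimulus and energy jobs."
)

print(top_words(sample, 2))
```

Counting the words is the easy part. Turning those counts into something a citizen can take in at a glance is where the presentation work comes in.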

Andrew Rasiej, Personal Democracy Forum founder and Sunlight senior technology adviser, has said government should put the Sunlight Foundation out of business by fully embracing Web 2.0. I won’t hold my breath, but I wouldn’t be unhappy if that happened. Government should be in the business of devising methods of both serving up its data and communicating its meaning, so that the citizens it serves can use it as they see fit.