The Missing Open Data Policy

Open data policies aren’t doing nearly as much good as they could, because they usually fail to require new information to be disclosed.  To fix this, governments should make their information policy decisions as publicly as possible, indexing their major information holdings and publicly determining whether or not to release each of them.

Most newly implemented open data policies, much like the Open Government Directive, are announced alongside a package of newly released datasets, and often new data portals, like Data.gov. In a sense, these pieces have become the standard parts of the government data transparency structure.  There’s a policy that says data should generally be open and usefully released, a central site for accessing it, some set of new data, and perhaps a few apps that demonstrate the data’s value.

Unfortunately, this is not the anatomy of an open government.  Instead, this is the anatomy of the popular open government data initiatives that are currently in favor. Governments have learned to say that data will be open, provide a place to find it, release some selected datasets, and point to its reuse.

What gets left out of these initiatives, however, is often the most important part — the decisions as to what gets released, and how.  Many open government data discussions skip over the question of whether governments are deciding appropriately what gets released and what doesn’t.  Instead of making complex decisions about what should be released, central governments suggest that those decisions are hard, and that as long as there’s always some new information, then we’re making progress that deserves praise.

Progress or not, open data policies often pretend to be something they aren’t.  The Open Government Directive is simply dismissed or ignored by agencies who decide not to release information, as we’ve often pointed out before.

In the face of this shortfall, we at Sunlight have tried to focus on real decisions about actual datasets, and to force agencies to do the same. While that’s proved difficult to do, as existing requirements like the Paperwork Reduction Act, the OGD, and the Presidential Memorandum on Regulatory Compliance are often ignored, agencies do respond when pushed on substantive, particular issues.

So we’re not giving up on forcing agencies to make information policy decisions in public. One of the most important things that governments can do to be more transparent is to list, or index, all of their information holdings online.  CIOs should be more than just technology purchasers; the word information is in their title. Every agency should have a public list of its major information holdings, along with a description of whether each one is public or not, and why. Without creating such a list, how do Chief Information Officers even do their jobs?

Now, the question “where is all of our information” can be a tricky one to answer, but agencies can rely on threshold definitions.  For example, any database with a maintenance cost over a set threshold should be listed.  Any information specifically described in a statute governing the agency should be listed.  Any form, report, or data described in the regulations governing the agency should be listed.  Whether the information is usually (or never) accessible via FOI request should be noted, and whether bulk data is available through a central portal should be spelled out as well. (By far, the best example of such a review that we’ve seen is the DOT regulatory compliance plan, and the closest we’ve found for Congress is this.)
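To make those threshold definitions concrete, here is a minimal sketch of what one entry in such an index might look like, and how the listing rule could be applied. Everything in it is an assumption made for illustration: the field names, the `Holding` record, the `must_be_listed` and `build_index` helpers, and especially the dollar figure, since the right threshold is a policy choice that each government would have to make for itself.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the threshold itself is a policy decision,
# and this figure is only a placeholder.
COST_THRESHOLD = 100_000


@dataclass
class Holding:
    """One entry in an agency's public index of information holdings."""
    name: str
    annual_maintenance_cost: float = 0.0
    named_in_statute: bool = False       # described in a governing statute
    named_in_regulation: bool = False    # form, report, or data in the regulations
    foi_access: str = "unknown"          # "usually", "never", or "unknown"
    bulk_on_portal: bool = False         # bulk data available on a central portal
    is_public: bool = False
    reason: str = ""                     # why it is, or is not, public


def must_be_listed(holding, cost_threshold=COST_THRESHOLD):
    """Apply the threshold definitions: any one of them triggers listing."""
    return (
        holding.annual_maintenance_cost > cost_threshold
        or holding.named_in_statute
        or holding.named_in_regulation
    )


def build_index(holdings, cost_threshold=COST_THRESHOLD):
    """Return every holding that meets a threshold, public or not."""
    return [h for h in holdings if must_be_listed(h, cost_threshold)]
```

Note that `build_index` does not filter on `is_public`. The point of the index is that withheld holdings appear on it too, each with a stated reason.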

If the public, and if the Congress (or other legislatures) are to be involved in creating a more open government, we need to be able to measure openness against a background that makes sense.  Governments ask to be measured against the failures of the past, but that’s just an insufficient standard by which to judge transparency reform.

Comprehensive indexes and audits of agency data force governments to make publicly accountable decisions about what is public and what isn’t. Lists of government data shouldn’t just include already public offerings, either. (New York’s new open government law makes just this move, requiring lists of “public” data, allowing exceptions for anything that might be withheld for any reason, and ignoring all the information that should probably be public, but isn’t.) If we can’t see the decisions governments are making about what to release, then we can’t change them.  FOI laws provide a basic instrument, but a broad mandate that places the data management burden on agency officials could systematically open far more information than ad hoc requests are ever likely to.

If governments can build data portals, hold competitions, and spend huge sums of money on complex data systems, they should be able to build public lists of those systems, along with a description of what is public and what isn’t. Archivists have done this for a century for old records, and it’s time for a similar amount of rigor to be applied to transparency policy decisions.

The missing open data policy is the one that says: list your data, and say which of it is open, and why.