Sigh. I feel like a disappointed parent.
When the details of the Open Government Directive were announced early last December, I was unbelievably excited. Seriously. My longtime hope that one day government would get “it” about the importance of putting public information online appeared to have come true. Government data was going to become available by default, and that was going to start with an “inventory” (government's word) of the “high value information” (also their words, though less than ideal, because who would ever agree on what that means?).
Agencies were supposed to do two things with respect to releasing data: create an inventory of the “high-value information” currently available for download, and identify high-value information not yet available, along with a reasonable timeline for publishing that data online. It was that latter requirement that I salivated over. Certainly there are other important aspects of Open Government -- participation and collaboration are values we hold dear at the Sunlight Foundation. But yesterday was the day when the rubber was supposed to hit the road on data. Many agencies didn't even get out of the garage.
There is some very interesting data that's going to be made available almost immediately (and John Wonderlich, our Policy Director, has a post on it), but some agencies avoided the requirement entirely, some said they'd make a plan to plan how to identify and release data, and others mentioned it but didn't explain how they would achieve it.
First, our quick review shows that a little more than half of the 30 agencies' plans we reviewed (18) specifically identified new data to be released -- 12 did not. (This includes some independent agencies.) The total number of data sets identified to be released -- approximately 89.*
89 data sets identified for release - across the entire federal government!? I'm speechless. I was looking for inventories of data (this is the Directive's word, after all) -- actual audits of what data each agency collects and dates of when new information would be made available. That is not what we got.
The Department of Health and Human Services (HHS) was among the best - identifying 14 new data sets to be released - and this is crucial data. While protecting the privacy and identity of patients, HHS will be releasing critical data about Medicare: everything from inpatient hospital stats to prescription drugs and hospice care. During an era of increased responsibilities for HHS, this data is absolutely critical to keeping HHS effective and accountable.
Few agencies rose to the high water mark of HHS. Part of the problem might be attributable to cultural barriers and the illusion some bureaucrats hold that this is "their" data vs "all of our" data. Part of the problem might have been the time needed to pull the information together.
Maybe a bigger part of this problem is a loophole in the Open Government Directive itself. By asking agencies to inventory only their "high-value" data, it gave them an instant out for just about anything. That is despite the White House's good intentions in defining "high-value" as information that can be used to: "increase agency accountability and responsiveness; improve public knowledge of the agency and its operations; further the core mission of the agency; create economic opportunity; or respond to need and demand as identified through public consultation."
With a definition like that, "high-value" could mean literally anything: if you collect a piece of data that does not "further the core mission of the agency," why are you collecting it?
When you define a concept too broadly, you end up not defining it at all. If we could roll back the clock on the Open Government Directive, we would ask agencies to first list all the data they collect and then create sub-lists of:
- data that is currently public but not online
- data that is currently public and online
- data that is not currently online but will be put online, and when
- for everything else, explain why it won't be put online
This would give us an instant picture of what the online (and therefore public) landscape of the federal government looks like, and it would be an invaluable data set in its own right.
HHS, NASA, Education, the National Archives and Records Administration and the Office of Personnel Management were the high water marks.
Defense, Homeland Security, Justice, State, Interior, Treasury, Veterans Affairs, the US Agency for International Development and the Social Security Administration did not identify any new data to be made available - no inventories either.
Yes, I appreciate the extraordinarily hard work put into the Open Government Directive by all those in the agencies and those spearheading it at the Office of Management and Budget and the White House. And I wouldn't suggest that evaluating these plans on just one of a couple dozen appropriate criteria is a totally fair reading of how successful this exercise was. But I have to look at it from what I feel is key for government accountability - data. That's my lens on the world.
We'll continue to evaluate agency plans all next week.
*We arrived at the 89 number via a very generous methodology. It all depends on how you define a "data set". Our complete inventory using a more exact methodology will be available soon.
Photo credit: "Idling" by Flickr user N1NJ4.