This post reflects the state of agency data listings on Dec 3, 2013 and will not be updated as agencies continue to comply with the open data executive order over the course of 2014. More posts about compliance with the executive order can be found here.
The first major deadline for agency compliance with President Obama's open data Executive Order arrived this past Saturday. Agencies were required to, among other things, provide the Office of Management and Budget with an "Enterprise Data Inventory" and release a list of all their public data via a /data page on their websites.
We had hopes that some agencies might choose to publicly release their entire Enterprise Data Inventories, providing a full picture of their data holdings. Unfortunately, so far, that does not seem to have happened. Until the full inventories are available, the public will still be stuck in the dark, not knowing what we don’t know about government data holdings.
Nonetheless, most Cabinet-level agencies, as well as a number of independent agencies that were not required to comply, have taken steps to publicly fulfill the other aspects of the Executive Order. Levels of compliance have varied, but we will try to highlight some of the worst and best examples below.
Poor compliance is bad; no compliance at all is simply unacceptable. Three Cabinet-level agencies deserve special recognition for not even bothering to release public data listings. The Departments of Commerce, Defense, and Veterans Affairs have completely failed to release data.json files or update their web pages with new information, despite having data sets publicly available on data.gov. This utter lack of compliance suggests that these agencies either don’t care enough or aren’t competent enough to comply with the wishes of the White House.
Saturday’s deadline required agencies to include all of their publicly available data sets in their public data listings. This, presumably, would include all of the data sets that an agency makes available on data.gov. However, several agencies included significantly fewer data sets in their listings than they post on data.gov.
The Departments of Agriculture, Homeland Security, and Treasury all have relatively small numbers of data sets available through data.gov and even fewer documented in their public data listings. The Department of Transportation has a robust presence on data.gov, with over 2,000 data sets available, yet its data listing includes more than 400 fewer entries. This failure to meet the minimum standard does not give us high hopes for these agencies’ ability to provide high-quality information moving forward.
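The comparison above can be reproduced for any agency with a few lines of Python. This is a minimal sketch, not the method used for this post: it assumes the data.json listing is either a bare JSON array of data set objects or an object wrapping that array under a "dataset" key, and that the data.gov count has been looked up separately. The function names and the sample listing are hypothetical.

```python
import json


def count_listed_datasets(data_json_text):
    """Return the number of data set entries in a data.json document."""
    listing = json.loads(data_json_text)
    # Assumption: the listing is a bare array of data set objects, or an
    # object that wraps that array under a "dataset" key.
    if isinstance(listing, dict):
        listing = listing.get("dataset", [])
    return len(listing)


def listing_gap(data_json_text, data_gov_count):
    """Positive result: data.gov shows more data sets than the agency listing."""
    return data_gov_count - count_listed_datasets(data_json_text)


# Hypothetical two-entry listing, compared against a made-up data.gov count.
sample = '[{"title": "Example A"}, {"title": "Example B"}]'
print(listing_gap(sample, 5))
```

A positive gap means the agency posts more on data.gov than it documents in its own listing; a negative gap means the listing goes beyond what is on data.gov.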
Agencies were required to release their data listings in both “human-readable and machine-readable” formats. Most agencies were able to release their listings in JSON format, complying with the “machine-readable” aspect of the guidance, but only a few took the “human-readable” requirement to heart. The next section will kick off by highlighting those examples.
Just putting data online isn’t enough to make it publicly accessible in a meaningful way; it must be available through a functional, user-friendly interface. Several agencies have made good progress in designing accessible user interfaces, creating useful search functions, and making their data listings available in well-organized human-readable and machine-readable formats.
The Department of Education, for example, offers an excellent search function that allows the user to easily do a keyword search of all the data in its public listing. Many other sites lack this functionality, but it is key to making data truly publicly accessible. The Department of Education makes this even easier by succinctly translating the metadata about each data set from its JSON file into this searchable system.
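To illustrate the idea of searching a listing's metadata, here is a rough sketch of a keyword search over data.json entries. It is not how the Department of Education's tool works, which we have not inspected. It assumes each entry carries "title", "description", and "keyword" fields, and the sample entries are invented for the example.

```python
# Assumption: these metadata fields are present on each data set entry.
SEARCH_FIELDS = ("title", "description", "keyword")


def search_listing(datasets, term):
    """Return titles of entries whose searchable metadata contains the term."""
    term = term.lower()
    matches = []
    for entry in datasets:
        texts = []
        for field in SEARCH_FIELDS:
            value = entry.get(field, "")
            # "keyword" is typically a list of tags; other fields are strings.
            if isinstance(value, list):
                texts.extend(str(item) for item in value)
            else:
                texts.append(str(value))
        if any(term in text.lower() for text in texts):
            matches.append(entry.get("title", "(untitled)"))
    return matches


# Hypothetical entries, for illustration only.
sample_entries = [
    {"title": "School Enrollment", "description": "Counts by district",
     "keyword": ["enrollment", "schools"]},
    {"title": "Loan Volume", "description": "Student loans by year",
     "keyword": ["finance"]},
]
print(search_listing(sample_entries, "student"))
```

Matching is case-insensitive and substring-based, which is crude but enough to show why clean, consistent metadata matters for making a listing searchable.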
The Environmental Protection Agency has well-organized outlines that divide up its data sets into issue areas, for example: “air,” “food safety,” “health effects,” and “wastes.” Issue areas are then further divided into subcomponents, for example, “wastes” lists “hazardous waste,” “liquid waste,” and “solid waste.” This allows the user to more easily locate relevant data sets. The site also has a search capability, though it doesn’t appear to be as functional as the Department of Education’s search tool.
Most agencies, the Department of Energy and the National Science Foundation to name just two, do not offer human-readable versions of their listings, choosing instead to meet the minimum level of compliance. Moving forward, it would be nice to see more agencies take a page from the playbook of the Department of Education, the EPA, and the Department of Justice by putting up well-formatted versions of their data listings on their /data pages.
While we’ve seen that some agencies failed to even migrate their data.gov holdings to their public data listings, others have far surpassed this goal.
In addition to publishing an easily readable list of data alongside its data.json file, the Department of Justice included information about 690 data sets, over 500 more than it currently makes available on data.gov.
That’s impressive, but nothing compared to the Department of the Interior, which has included information on a whopping 65,804 data sets in its listing. That’s nearly six times the already impressive number of data sets it makes available on data.gov.
This analysis barely scratches the surface of how agencies have complied, or failed to comply, with the open data Executive Order. Over the coming days we’ll be digging into the metadata and exploring the documents to see which agencies are going above and beyond the guidance, and which agencies are merely paying lip service to it.
In the meantime, you can check out our research document and find links to all the relevant pages in the chart below.