Executive branch data listings should be human readable

Late last year, when we took a look at how well agencies were complying with the first deadlines tied to President Barack Obama’s open data executive order, we were impressed that most of the Cabinet-level agencies (along with a number of independent ones) released their data.json files on time. Unfortunately, another requirement in the executive order guidance is proving a little more time-consuming for agencies to accomplish.

The guidance requires agencies to set up a /data page on their websites. The page should include a human readable “table/list of each data set,” which is intended to serve as the “authoritative source of publicly available agency data.” As of this writing, only a few agencies include such a list on their /data page.

It is understandable that agencies chose to put this requirement off in favor of building out their JSON files and getting in sync with data.gov’s system. However, it is important that agencies recognize these human readable lists as a vital part of complying with the executive order. These lists will allow less tech-savvy data consumers to fully engage with the scope of an agency’s data holdings. Without an easily readable list, consumers will either need programming knowledge or a deep understanding of the often complicated workings of data.gov to effectively utilize the new data listings.

To the untrained eye, these JSON files appear to be nothing more than unending strings of characters. It’s not likely that non-programmers would be able to quickly parse an inventory in this format to find what they are looking for.

Screenshot of the Department of Labor JSON public data inventory.

In contrast, some agencies have made the open and accessible display of these data sets a priority. The Department of Justice’s data inventory is a great example. Each of its data sets has a clearly delineated title, so web viewers don’t have to dig through line after line of unformatted text to find what they’re looking for. Each listing also includes a short summary paragraph, contact information, a hyperlink for downloading the data and other information.

Screenshot of the Department of Justice’s public data listing website, showing metadata that includes the title, contact information and a URL for downloading the data set.

This display allows academics, journalists and others searching for information on the Department of Justice website to readily find data sources matching their search terms.

Such agencies have designed their /data pages around robust, readable lists. In the short term, however, the solution can be as simple as hosting a spreadsheet. It could take mere hours for an adept programmer to pull the information in an agency’s data.json file into an easier-to-read spreadsheet that could then be hosted on the corresponding /data page.
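To give a sense of how small that job could be, here is a minimal Python sketch of the conversion. It is an illustration under stated assumptions, not a description of how any agency built its page. The column names in FIELDS (title, description, contactPoint, mbox, accessURL) are our guess at commonly used data.json fields and should be adjusted to match the inventory at hand. The function names and the sample record are hypothetical.

```python
import csv
import io
import json

# Assumed field names; adjust to match the agency's actual data.json.
FIELDS = ["title", "description", "contactPoint", "mbox", "accessURL"]


def load_datasets(raw_json):
    """Return the list of data set records from a data.json string."""
    parsed = json.loads(raw_json)
    # Assumption: an inventory is either a bare list of records or an
    # object that wraps the list in a "dataset" key.
    if isinstance(parsed, dict):
        parsed = parsed.get("dataset", [])
    return [entry for entry in parsed if isinstance(entry, dict)]


def flatten(value):
    """Turn a JSON value into a single spreadsheet-friendly string."""
    if value is None:
        return ""
    if isinstance(value, list):
        return "; ".join(flatten(item) for item in value)
    if isinstance(value, dict):
        return "; ".join(f"{k}: {flatten(v)}" for k, v in value.items())
    return str(value)


def inventory_to_csv(raw_json, fields=FIELDS):
    """Convert a data.json string into CSV text, one row per data set."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(fields)
    for entry in load_datasets(raw_json):
        # Missing fields become empty cells rather than errors.
        writer.writerow([flatten(entry.get(name)) for name in fields])
    return out.getvalue()


if __name__ == "__main__":
    # A made-up record for demonstration only.
    sample = json.dumps([{
        "title": "Example data set",
        "description": "A made-up record.",
        "contactPoint": "Jane Doe",
        "mbox": "jane.doe@example.gov",
        "accessURL": "https://example.gov/data/example.csv",
    }])
    print(inventory_to_csv(sample))
```

The resulting CSV text can be saved to a file and opened in any spreadsheet program. Fields that hold lists, such as keywords, are joined with semicolons so that each data set stays on a single row.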

With Open Data Day (Feb. 22) fast approaching, we think that conversion of some of the hairier JSON data files into a more human readable format would be a worthy task for any programmer interested in freeing government data.

The departments of Commerce, Defense and Homeland Security are all among the agencies missing a full web-based public data listing. Converting their JSON inventories could give scores of new researchers access to these valuable troves of data.