One year later, Data.gov bigger but needs to get better
One year ago, the U.S. government launched Data.gov, a central plank in its Open Government initiative to make it easier for the public to find and use official datasets. The site has grown from an initial 47 databases to more than 272,000, and attracted nearly 100 million hits. It inspired eight American cities – including San Francisco and New York City – eight states, and six other nations to launch similar sites of their own. By most metrics, the project has been a success.
But government transparency advocates say the site has plenty of room for improvement. Gabriela Schneider of the Sunlight Foundation said that while the site is a good concept, it needs to be stocked with more useful datasets. “Over 270,000 of the 272,000 datasets on Data.gov are very specific geodata — essentially shapefiles that can be used to draw maps. [These are] not new and not really useful for researchers, developers and journalists,” she said. “There’s about 1,500 datasets that we actually care about and find valuable. A year into its existence, Data.gov should be cataloging much more than that.”
Data quantity is not her only concern, Schneider said. The site would be greatly improved if it added more data explanation and interpretation. “We’d love for the government to focus on improving the quality of the data, and also the metadata quality,” she said. “I have no idea how this data is being curated, how often is it being reviewed, how often is it being updated.”
Schneider said the one-year anniversary of Data.gov brought too much focus on the “shiny” parts of the site, such as its design, rather than a reinvigorated push for more information.
Federal government officials in the General Services Administration, Office of Science and Technology Policy, Department of the Interior, and Chief Information Officer’s office, among others, did not respond to requests for comment from the Center.
When the Obama administration’s chief information officer, Vivek Kundra, wrote a post for the Open Government blog celebrating the one-year anniversary of Data.gov, he encouraged citizens to use Twitter to suggest more data for the site to publish.
Bryan Rahija of the Project on Government Oversight shares Schneider’s worry about data quality. “A concern that we have about the whole set-up is [that] agencies are [either] releasing information of convenience or information of substance, and so far it has leaned to information of convenience,” he said. For example, he suggested that the Interior Department could have released data about oil drilling permit holders in the Gulf of Mexico, but has yet to release any datasets relevant to the oil spill.
Rahija also said he had problems with the formatting of some datasets, and when he asked for help, the response was slow. “Back in February I wrote the Energy Department asking, ‘do you think you can put [a database] up in XLS format?’” he said. “And I received an e-mail three or four weeks ago saying, ‘Hey, we’re working on that. No news yet.’ We’re four months into that request. I think it could go faster.”
Overall, both advocates said the Data.gov project is certainly a step in the right direction.
“I would hate to criticize, because the fact is they’re making this information more available,” Schneider said. “But the next step is to make it more useful.”
Added Rahija: “I think it’s important that the government lay some sort of a framework for this information to be out there, so it’s not all bad news.”
ABOUT THE DATA:
What: Thousands of U.S. agency records and data
Availability: Many different databases available for download