Earlier today, Speaker Boehner, Majority Leader Cantor, and the Government Printing Office announced an improvement in how legislation is made publicly available. Starting in the 113th Congress, GPO will make all bills available for bulk download in XML format. While this doesn't change much from a technological perspective, it does mark a significant change from a policy perspective.
Starting with the technology, the public is already able to download legislation from GPO in XML format (and has been able to do so for all legislation since the 111th Congress). Prior to today's announcement, users who wanted to access this legislation systematically had to write a short program to search through hundreds of pages to find the files. Now users can skip this step and download the information in one place. This is a trivial difference to a programmer, and no new information was released, but it has significant policy implications.
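To make the technical point concrete, here is a minimal Python sketch of what a programmer might do once a bulk download has been unpacked into a local folder. It is an illustration under stated assumptions, not a description of GPO's actual file layout: the folder of `.xml` files and the element names `legis-num` and `official-title` are assumptions, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def _element_text(root, tag):
    """Return the whitespace-normalized text of the first descendant <tag>, or ""."""
    elem = root.find(f".//{tag}")
    if elem is None:
        return ""
    # itertext() also picks up text nested inside child elements.
    return " ".join("".join(elem.itertext()).split())


def summarize_bills(folder):
    """Return (file name, bill number, official title) for each XML file in folder."""
    rows = []
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # skip files that are not well-formed XML
        # Assumed element names; adjust them to whatever the real files use.
        number = _element_text(root, "legis-num")
        title = _element_text(root, "official-title")
        rows.append((path.name, number, title))
    return rows
```

Calling `summarize_bills("bills")` on a hypothetical `bills` folder would return one tuple per readable file. The point is that the hard part was never the parsing; it was finding the files in the first place.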
Today's announcement reflects the exercise of political will by the House leadership in furtherance of improved public access to legislative information. It grew out of years of frustrated efforts that turned to a more promising path in April 2011, when the House leadership declared that documents are a priority for the 112th Congress, and directed the Clerk to create docs.house.gov. (Public access is a bipartisan issue, with Republicans John Boehner, Eric Cantor, Darrell Issa, and Dan Lungren fighting side-by-side with Democrats Steny Hoyer, Mike Honda, and Mike Quigley.) This shift to relying on the House's internal expertise was key to making progress. Docs.house.gov was immediately successful when it launched at the end of 2011, and it provides an ever-evolving one-stop website where the public can access all House bills, amendments, and resolutions for floor consideration, among other things.
These successful intra-House efforts put pressure on legislative support agencies that traditionally were charged with disseminating congressional electronic information but had failed to keep up with the times and resisted efforts to modernize. After a significant controversy over the House's legislative branch appropriations bill last June, which was a legislative stalking horse for the role of the Library of Congress and GPO, House leaders issued an important statement that established a task force on bulk data and declared "our goal is to provide bulk access to legislative information to the American people without further delay."
In some respects, the Bulk Data Task Force was an unnecessary step, as it was clear (and had been for a while) what Congress needed to do. The fear among advocates was that the Task Force would be a graveyard for transparency efforts.
In the three months between the creation of the Task Force and its first meeting in September, a group of us issued a report containing recommendations on implementing public access to legislative information. In October, we had a very productive meeting with the Task Force about our recommendations. We were pleased with how engaged the group was on the issues and the wide array of interests that were represented. Incidentally, it is my understanding that the Task Force has been meeting regularly, with those meetings providing a unique forum for the many internal stakeholders within Congress to discuss their technology-related needs and activities.
While it was clear to us that many in the political leadership in the House are committed to electronic legislative transparency, there remain questions about the willingness of the legislative support agencies and the role of the Senate. Indeed, it is telling that the information published by GPO today includes only House bills and not Senate bills, even though Senate bill information is already available (in other ways) from GPO.
What today's move indicates is that GPO is willing to be responsive to the needs of the House. At the same time, the House demonstrated earlier this week that it will continue to push on these issues by itself, if need be, when it released House floor summaries for bulk download in XML going back to 2005. We also are seeing a further expansion of docs.house.gov at the turn of the new year as it has started to include committee information. The Clerk's office deserves credit for its continued efforts.
What's interesting about both the bulk access to House bills and the bulk access to House floor summaries is that both are credited as efforts coordinated or initiated by the Legislative Branch Bulk Data Task Force. While this week's improvements are incremental, they point down the right path. Indeed, we recommended that the House take an incremental approach to its technology projects -- to iterate quickly and do what it can immediately -- and we must commend it for doing so.
Ultimately, this path should take us to the point where all legislative information published on THOMAS (or its successor Congress.gov) is available online, in real time, as structured data that is capable of being downloaded in bulk. The most requested data is legislative status information, which is held by the Library of Congress and still is not available today as structured data, bulk or otherwise. That includes when the bill was introduced, who co-sponsored it, a summary of the legislation, and so on.
Status information is prepared by the Library of Congress, which has historically been reluctant to make this information available to the public in any format other than a series of webpages. But we know, based on a March 2008 memo, that the hurdle here is political will, not technology. That's why today's announcement is encouraging. The task force is starting to crack open the vault. Let's hope that the Senate and the Library of Congress are coming to share the House's enthusiasm for transparency.