House Begins Publishing Committee Data

The House of Representatives’ document portal launched in January 2012 with a surprisingly rich and relevant set of data: all bills and amendments (including drafts) that would come to the floor over the next week, and extensive XML metadata about each document and when it was updated. It’s hard to overstate the value of this data. Information on what the House is about to do is vital: to participate effectively in our democracy, you need some lead time.

The House has doubled down on its pledge to keep innovating, and has begun to release what promises to be an expansive set of committee information. The portal’s expansion in breadth from floor proceedings to include committee activities gives the public significant new opportunities to understand how the House functions, as well as a much earlier entry point for citizens to become substantively involved in the legislative process. The committee section is organized around a new calendar of committee activities that extends what was previously available. The calendar identifies committee activities further in advance than the current system and provides a landing page with extensive information and documents related to each activity, such as the names of witnesses, written testimony, and draft legislation. In addition, each committee activity has associated XML with structured information on both the activity and all related documents, so that developers can easily access and reuse the information. This more than satisfies our recommendation that the House improve how it gives notice about upcoming committee activities.

All documents contained in this portal can be searched and filtered by committee and subcommittee (documents from House Rules, for example), and every committee and subcommittee has its own RSS feed. It’s still not perfect: for example, it’s not obvious how one could automatically discover the available XML on the site without scraping HTML to find the associated IDs and URLs. But this could be addressed by offering a full XML feed of activity, as the House’s Floor page already does.
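As a rough illustration of how a developer might consume one of these per-committee RSS feeds, here is a minimal sketch in Python. It assumes the feed follows the usual RSS 2.0 layout (`item` elements with `title`, `link`, and `pubDate`). The sample feed, its URLs, and the `parse_feed` helper are all hypothetical stand-ins, not the portal's actual output or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for a committee feed.
# The real feed's items, fields, and URLs will differ.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Committee Feed</title>
    <item>
      <title>Hearing: Example Topic</title>
      <link>https://example.invalid/event/1</link>
      <pubDate>Mon, 06 Feb 2012 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Markup: Example Bill</title>
      <link>https://example.invalid/event/2</link>
      <pubDate>Tue, 07 Feb 2012 14:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_feed(rss_text):
    """Return a list of dicts, one per <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the child element is missing
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_feed(SAMPLE_RSS):
        print(entry["published"], "|", entry["title"], "|", entry["link"])
```

To run this against a live feed, the text could be fetched with `urllib.request.urlopen` and passed to `parse_feed`. Note that this only covers what a feed exposes. Discovering the per-activity XML would still require the IDs and URLs that, as noted above, currently have to be scraped from HTML.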

Taken together, these additions provide both a useful set of data and a promising new scope for this important legislative information portal. Our experience has taught us that gathering information about House and Senate committees in any automated way is extremely frustrating, because every committee has its own website and its own way of doing things. As a result, even though House and Senate floor votes are posted quickly and centrally, the votes members of Congress take in committee are not available in any timely, central location. It’s easy to imagine the portal evolving to become an incredibly useful guide that connects citizens to the information they seek. While it will never replace committee webpages, nor should it, the portal will help ensure that committee information is made more prominent among the activities of the House.

One additional noteworthy aspect of the portal is that it is built and maintained by the Clerk of the House. This means that the information it contains is non-partisan and should persist over time. While committee websites are often wiped clean when a new chairperson takes power, the portal should provide a measure of institutional memory independent of leadership and party. This is a smart move. In addition, we’ve noticed that some of the legislative support agencies have been unwilling or unable to play the role of a central legislative document clearinghouse. Having the Clerk’s office serve as a clearinghouse has swept aside the bureaucracy and allowed tangible progress to be made. Let’s hope that both the Senate and the legislative support agencies follow this example, now that the House has demonstrated what’s possible.

written by Eric Mill and Daniel Schuman