Today 30 organizations from across the political spectrum joined together to ask Congress to improve public access to legislative information. Our joint letters to congressional appropriators and rulemakers urge Congress to direct that the THOMAS legislative database be published online in bulk and to establish an advisory committee on further improvements.
THOMAS, Congress' legislative information website, provides basic information about legislative and congressional actions but has fallen far behind the needs of its users. Many have turned to important websites like GovTrack, OpenCongress, and WashingtonWatch to monitor congressional activities.
These sites and others, which repackage and add important context to legislative activities, extract data from the THOMAS website through a painstaking and often brittle process. To make this process easier and more reliable, the Library of Congress should publish THOMAS information "in bulk," making the entire legislative database available for download at once, rather than publishing it in a way that forces users to scrape data from hundreds or thousands of webpages.
Bulk access to legislative information is already common practice inside and outside the government. For example,
- The Government Printing Office publishes six major databases, including the Federal Register, in bulk;
- The House's Office of Law Revision Counsel publishes the U.S. Code in bulk;
- New Jersey and New Hampshire publish their legislative information in bulk; and
- Data.gov has nearly 400,000 datasets available in bulk, including 4,395 high-value datasets.
The transparency community, technology innovators, journalists, good government organizations, and private companies have long sought bulk access to legislative data. In May 2007, a coalition of organizations called on Congress to "embrace structured data by publishing the status of legislation and other information to the web ... in structured data formats." In 2009, Congress articulated support for bulk access to legislative data in an explanatory statement accompanying an appropriations bill. And in November 2011, one of the action items emerging from the House's Congressional Facebook Hackathon was an endorsement of releasing "structured machine-readable legislative data ... in a bulk format."
This past year the Sunlight Foundation, GovTrack, and OpenCongress submitted testimony to House appropriators calling for bulk access to legislative information. We applaud the major strides made by the House of Representatives in improving public access to the House's legislative information, but what's missing is the kind of information only available through the THOMAS website. This includes bill summaries, bill status information, bill co-sponsors, and other information that provides important context for legislation.
We estimate that for every person who goes directly to the THOMAS website, at least two people visit a third-party website. But even these sites must rely on legislative information generated and maintained by Congress, which is only available through the difficult-to-use THOMAS website. There will always be a need for a congressionally mandated website, but Congress should ensure that the innovative and transformative uses of legislative information by third parties are grounded upon accurate and timely data. And that means providing bulk access to everyone.