LOC

 

Why Are House Appropriators Not Webcasting Their Meetings?

The House Legislative Branch Appropriations Subcommittee just scheduled four budget hearings for next week, none of which will be webcast (according to their public notices). Just like last year, the hearings will be held in a tiny room in the Capitol that is often crowded past capacity. The public has a right to attend these meetings, and House Rules require that they be webcast (whenever practicable).

So what does "practicable" mean? When we surveyed how frequently committees webcast their hearings last year, we found that House Appropriators stood out for the lack of transparency in their proceedings.

The Sunlight Foundation tracked 200 House hearings over 20 days to determine whether they were webcast live, plus 407 hearings from January 17 to April 2 to determine whether video of the proceedings was archived online. Twenty-four percent (48 of 200) of the hearings were not live-streamed, and 22 percent (91 of 407) were not archived on committee websites.
Of the 48 hearings that were not live-streamed, 47 were Appropriations Committee hearings (Armed Services was the other one). Similarly, of the 91 hearings that did not have video archived on the committee website, 74 were Appropriations Committee hearings.
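
As a quick check, the shares in this survey can be recomputed from the raw counts. The short Python sketch below uses only the figures reported above; the variable names and the `share` helper are illustrative, not part of the original survey.

```python
# Counts reported in the survey above.
live_tracked, live_missing = 200, 48        # hearings checked for live webcast / not streamed
archive_tracked, archive_missing = 407, 91  # hearings checked for archived video / not archived
approps_not_streamed = 47                   # Appropriations hearings among the 48
approps_not_archived = 74                   # Appropriations hearings among the 91


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(share(live_missing, live_tracked))             # hearings not live-streamed
print(share(archive_missing, archive_tracked))       # hearings not archived
print(share(approps_not_streamed, live_missing))     # Appropriations share of unstreamed hearings
print(share(approps_not_archived, archive_missing))  # Appropriations share of unarchived hearings
```

By these counts, Appropriations accounts for roughly 98 percent of the hearings that were not streamed and about 81 percent of those that were not archived.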

This is an intensely frustrating and longstanding problem.

I'm singling out the Legislative Branch Appropriations Subcommittee's budget hearings on GPO, LOC, GAO, and CBO because we at the Sunlight Foundation care a lot about the legislative support agencies, particularly as they empower a lot of federal transparency. (And they've been actively working on government transparency issues, and there's more that we'd like them to do.)

But it's unfair to single them out. A quick look at the upcoming hearings and meetings for the Appropriations Committee finds meeting after meeting that won't be webcast. The hearing on nuclear nonproliferation? Won't be webcast. Indian education? Nope. Army Corps of Engineers? Out of luck. Of the ten upcoming hearings that indicate webcasting status, two will be webcast and eight (including a closed hearing) will not.

With the budget crisis, impending sequester, and questions about federal spending, how is it that the committee most responsible for spending money is the one that's least likely to put its meetings online? We've seen a commitment from the House leadership to do better, and I hope that the Appropriations Committee will find a way to make that happen.

GPO is Closing Gap on Public Access to Law at JCP's Direction, But Much Work Remains

The GPO's recent electronic publication of all legislation enacted by Congress from 1951 to 2009 is noteworthy for several reasons. It makes available nearly 40 years of lawmaking that wasn't previously available online from any official source, narrowing part of a much larger information gap. It meets one of three long-standing directives from Congress's Joint Committee on Printing regarding public access to important legislative information. And it publishes the information in a way that gives third-party providers a platform for making clever use of it. While more work is still needed to make important legislative information available to the public, this online release is a useful step in the right direction.

Narrowing the Gap

In mid-January 2013, GPO published approximately 32,000 individual documents, along with descriptive metadata, including all bills enacted into law, joint and concurrent resolutions that passed both chambers of Congress, and presidential proclamations from 1951 to 2009. The documents have traditionally been published in print in volumes known as the "Statutes at Large," which commonly contain all the materials issued during a calendar year.

The Statutes at Large are an official source for federal laws and concurrent resolutions passed by Congress. They are compilations of "slip laws," bills enacted by both chambers of Congress and signed by the President. By contrast, while many people look to the US Code to find the law, many sections of the Code are not actually the "official" law. A special office within the House of Representatives reorganizes the contents of the slip laws thematically into the 50 titles that make up the US Code, but unless that reorganized document (the US Code) is itself passed by Congress and signed into law by the President, it remains an incredibly helpful but ultimately unofficial source for US law. (Only half of the titles of the US Code have been enacted by Congress, and thus have become law themselves.) Moreover, if you want to see the intact text of the legislation as originally passed by Congress -- before it's broken up and scattered throughout the US Code -- the place to look is the Statutes at Large.

In 2011, GPO published 58 volumes of the Statutes at Large, covering 1951-2009, but did not break the volumes down into their constituent documents. Up until that point, the public laws were available as individual documents on THOMAS from 1989 to present as HTML (and PDF in some instances), and from 1789 to 1875 as TIFF (unwieldy image) files from the Library of Congress. Even with this recent release, 76 years of federal law are still unavailable online in any format from any official source; and the files released for the years 1789 to 1875 by the Library of Congress are difficult to use.


House Convenes Second Public Meeting on Legislative Bulk Data

On January 30th, the House of Representatives held a public meeting on its efforts to release more legislative information to the public in ways that facilitate its reuse. This was the second meeting hosted by the Bulk Data Task Force to include members of the public; the task force began meeting privately in September 2012. (Sunlight and others made a presentation at a meeting in October on providing bulk access to legislative data.) This public meeting, organized by the Clerk's office, is a welcome manifestation of the consensus of political leaders of both parties in the House that now is the time to push Congress' legislative information sharing technology into the 21st century. In other words, it's time to open up Congress.

The meeting featured three presentations on ongoing initiatives, allowed for robust Q&A, and highlighted improvements expected to be rolled out over the next few months. In addition, the House recorded the presentations and has made the video available to the public. The ongoing initiatives are the release of bill text bulk data by GPO, the addition of committee information to docs.house.gov, and the release of floor summary bulk data. It's expected that these public meetings will continue at least once per quarter, or more often when prompted by new releases of information.

As part of the introductory remarks, the House's Deputy Clerk explained that a report had been generated by the Task Force at the end of the 112th Congress on bulk access to legislative data and was submitted to the House Legislative Branch Appropriations Subcommittee. It's likely that the report's recommendations will become public as part of the committee's hearings on the FY 2014 Appropriations Bill, at which time the public should have an opportunity to comment.


Access to Legislation Gets Better, Promise of More to Come

Earlier today, Speaker Boehner, Majority Leader Cantor, and the Government Printing Office announced an improvement in how legislation is made publicly available. Starting in the 113th Congress, GPO will make all bills available for bulk download in XML format. While this doesn't change much from a technological perspective, it does mark a significant change from a policy perspective.


Keeping Authentication Simple

The point of publishing bulk data is so it can be reused as widely as possible. This is particularly true for government data, which belongs to the public.

Government agencies are sometimes also concerned with ensuring the authenticity of their legal information, especially when the data might be seen as an official source. Authenticity breaks down into two major concerns: integrity (ensuring the text is accurate) and origin (proving it's official). A lot of people are used to the "wax seal" model of authenticity: the experience of opening a PDF and seeing that the document is signed and official. This model quickly breaks down for distributing bulk data.

The goals of ease of reuse and authentication are frequently presented as being in tension, but that tension is just as frequently overstated. There are straightforward approaches to guaranteeing authenticity of bulk data that do not encumber reuse.
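
One possible reading of "straightforward approaches" is a published checksum manifest. This is an assumption for illustration, not necessarily the approach this post goes on to recommend. In the sketch below, the publisher computes a SHA-256 hash for every file in a bulk download and posts the list on its official site; anyone reusing the data can recompute the hashes to confirm nothing was altered. The hash comparison addresses integrity, while serving the manifest over HTTPS from an official domain (or signing the manifest itself) would address origin. The function names `sha256_of`, `build_manifest`, and `verify` are hypothetical; only Python's standard library is used.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large bulk files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory: Path) -> dict[str, str]:
    """Publisher side: map each file's relative path to its SHA-256 digest."""
    return {
        p.relative_to(directory).as_posix(): sha256_of(p)
        for p in sorted(directory.rglob("*"))
        if p.is_file()
    }


def verify(directory: Path, manifest: dict[str, str]) -> list[str]:
    """Reuser side: return the names of files that are missing or altered."""
    failures = []
    for name, expected in manifest.items():
        target = directory / name
        if not target.is_file() or not hmac.compare_digest(sha256_of(target), expected):
            failures.append(name)
    return failures
```

Under this sketch the manifest is stored and published separately from the data directory, so a reuser can reformat, index, or republish the files freely and still point readers back to the official hashes. Nothing about the check requires the data to stay in PDF.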


Congress launches THOMAS successor Congress.gov

Seventeen years after the creation of THOMAS, Congress today launched a sleeker, more intuitive and user-friendly legislative information website, beta.congress.gov.

What's noticeable about this evolving beta website, besides the major improvements in how people can search and understand legislative developments, is what's still missing: public comment on the design process and computer-friendly bulk access to the underlying data.

We hope that Congress will now deeply engage with the public on the design and specifications process and make sure that legislative information is available in ways that most encourage analysis and reuse.

It's also worth remembering what the Library of Congress said in 1996 as it considered what should be included in its legislative information system:

To be most useful to Members of Congress, the legislative information system must provide access to a wide range of current and historical information, including existing statutes, support agency analyses, academic studies, court decisions, budget and financial data, regulations and executive branch policies, public and private sector analyses, lobby group position papers, and newspaper reports from local, national, and international sources.

We will have more to say as we dig deeper into the website. The Library of Congress' news release is below.

LOC News Announcement on Beta.Congress.gov

After 578 Days, Where's the Constitution Annotated?

578 days ago, Congress directed that the legal treatise Constitution Annotated be published online, but it's still not available. The Constitution Annotated, aka CONAN, is a 100-year-old continuously updated congressional report that explains the US Constitution as it has been interpreted by the Supreme Court. With so many important rulings coming out of the High Court, it's important to understand the effect of its decisions on the Constitution.

Here's what Congress, via the Joint Committee on Printing, required in a November 17, 2010 letter:

Update the online edition [of the Constitution Annotated] as frequently as possible, and to create new and improved functions on the CONAN site. The Congress and the public should find this site accessible and user-friendly.

The master file for CONAN is updated frequently and is available as a website accessible only to Congress. (The public version is updated only once a decade and is released in a barely usable format, which is why JCP sent the letter in the first place.) Many organizations have asked that CONAN be published online in its original (XML) format. JCP has directed that it be published online in a timely fashion, but in the less-useful PDF format. (It would be fine to publish it in both.)

This shouldn't be a particularly hard project, so we can't help but wonder why there's been such a long delay, and how much longer we'll have to wait. As an interim measure, it may be simplest for Congress to release to the public what it already publishes on its internal website. That should require the technological equivalent of flipping a switch.

This upcoming year, CONAN will be up for its once-a-decade print edition. With at least 4,870 statutorily mandated copies, at a guesstimated cost of $226 per copy, the House and Senate will pay over $1.1 million to prepare a document that will go out of date almost immediately. (Even assuming that 60% of the costs are for layout, which is necessary for an online edition as well, that's still $440,000 to print a very heavy doorstop.)
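
The arithmetic behind those figures can be checked in a few lines of Python. The inputs are the copy count, per-copy guesstimate, and layout share given above; the variable names are illustrative, and integer math is used so the dollar amounts are exact.

```python
copies = 4870          # statutorily mandated copies, per the text
cost_per_copy = 226    # guesstimated dollars per copy, per the text
layout_share_pct = 60  # assumed share of cost that an online edition would also incur

total = copies * cost_per_copy
print_only = total * (100 - layout_share_pct) // 100

print(f"total: ${total:,}")            # total: $1,100,620
print(f"print-only: ${print_only:,}")  # print-only: $440,248
```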

Some of these costs may be avoided by asking Congressional offices whether they prefer a paper version or electronic access, as is the practice with other legislative documents. But the bigger question is: what's taking so long? Is this a sign of bigger problems inside the Library of Congress and GPO? When will this finally be finished?

It looks like we'll have to continue to wait and see.

Media Spotlight on Congress Stalling Open Access to Legislation

The media's magnifying glass is concentrating attention on actions by the House Appropriations Committee that could stall progress on the public's access to legislative information. The Sunlight Foundation and our allies continue to push Congress to stop dragging its feet and join the 21st century by giving developers access to open legislative data so they can build the tools that keep citizens informed about what their government is doing.

Please find and call your Representative at 202-224-3121 or write to reinforce the American public's hunger to read and follow legislation. Here are some excerpts from recent media coverage on this important transparency issue:

Roll Call reports on Republican House leadership's strong support for bulk access and quotes Rep. Crenshaw misunderstanding the issue of authentication:

“The Speaker pledged to make the 112th Congress the most open and transparent Congress in history and to make legislative data available online and in bulk,” said Michael Steel, spokesman for Speaker John Boehner (R-Ohio). “He continues to look for the best way to do that.”

“Facilitating public access to bulk legislative data ... has been and will continue to be a priority for this committee,” echoed Salley Wood, spokeswoman for the House Administration Committee. But lawmakers’ hands would be tied until a task force could be convened and report back on its findings, according to the House report language.

“We wanted to create a system where we could have this available but also make sure we protect the authenticity and integrity of all this information,” said Rep. Ander Crenshaw (R-Fla.), chairman of the Appropriations Subcommittee on the Legislative Branch.

The Washington Examiner addresses the committee's confusion over how citizens use and should access government information:

Folks with computers -- notably, professional and citizen journalists -- would be able to take information about massive numbers of bills and analyze them in myriad ways -- if Congress would allow such information to be downloaded from THOMAS in bulk.

It won't. And, according to a new draft report from the House Appropriations Committee, it won't be allowing bulk data downloads from THOMAS anytime soon.

Instead of taking a step towards greater transparency, the committee got hung up on whether people would know if the data they're seeing on the Internet were accurate and really from Congress -- "authentication," they call it.

FierceGovernment notes the lack of a deadline for decision making:

The report retains language decried by transparency opponents that would indefinitely postpone public bulk downloads of legislative information in XML. Good government groups, including the Sunlight Foundation, have pressed for the Library of Congress to release the bulk data used to track legislative developments in the library's THOMAS website, arguing that they could do a better job of presenting information.

TechPresident reports on the frustration among transparency advocates:

Open government advocates are up in arms over what appears to be another attempt by government bureaucrats to stall the move to enable bulk data downloads of legislative information online.

Slashdot opens the issue for conversation to their community:

The House Appropriations Committee is considering a draft report that would forbid the Library of Congress to allow bulk downloads of bills pending before Congress. The Library of Congress currently has an online database called THOMAS (for Thomas Jefferson) that allows people to look up bills pending before Congress. The problem is that THOMAS is somewhat clunky and it is difficult to extract data from it. This draft report would forbid the Library of Congress from modernizing THOMAS until a task force reports back. I am pretty sure that the majority of people on Slashdot agree that being able to better understand how the various bills being considered by Congress interact would be good for this country.

Legal Informatics also has a nice collection of blog posts on this issue.

Follow the latest developments here.

Bulk Access Language Tweaked by Approps

The House Appropriations Committee has apparently tweaked its report language regarding bulk access to legislative information. The report approved by the Committee has been replaced on the website, but I have a copy of the original. Here's how the final paragraph has changed:

[Deleted: Accordingly, and before any bulk data downloads of legislative information are authorized,] The Committee directs the establishment of a task force composed of staff representatives of the Library of Congress, the Congressional Research Service, the Clerk of the House, the Government Printing Office, and such other congressional offices as may be necessary, to examine these and any additional issues it considers relevant and to report back to the Committee on Appropriations of the House and Senate.

What does this mean? It's responsive to one of the concerns we raised about the language, that "the report language is terribly overbroad: it prohibits the establishment of bulk data downloads of legislative information prior to the reporting back of the task force."  At least, as a matter of law, efforts around bulk access will not be frozen. Given that this restrictive language was inserted in the first place, it remains to be seen whether efforts around bulk data will continue as a matter of practice. (We have some reassurance on this count from the Speaker's Office.)

All the other concerns we raised before remain:

  • Why doesn't the task force include non-governmental participants if its focus is releasing information to the public?

  • When must it report back? There's no deadline for action.

  • Will draft reports be made available to the public for comment? Will meetings be open?

  • Why will the final report only be given to appropriators? It should be available to all members of Congress and to the public as well.

  • How are the issues entrusted to this task force any different from the issues already addressed by the Library of Congress in this 2008 memo? Where are any follow-on reports, contemplated in that memo, that engaged in "an examination of permanence and authentication of legislative data, along with any attendant issues, risks and workload?"

While we'd prefer the task force be open and transparent, to a large extent it is a red herring. The issues that it has been tasked with have either been addressed previously or are largely irrelevant. It's important that people continue to call and write their members of Congress.

 

Below the jump is the full text of the revised report language.

During the hearings this year, the Committee heard testimony on the dissemination of congressional information products in Extensible Markup Language (XML) format. XML permits data to be reused and repurposed not only for print output but for conversion into ebooks, mobile web applications, and other forms of content delivery including data mashups and other analytical tools. The Committee has heard requests for the increased dissemination of congressional information via bulk data download from non-governmental groups supporting openness and transparency in the legislative process. While sharing these goals, the Committee is also concerned that Congress maintains the ability to ensure that its legislative data files remain intact and a trusted source once they are removed from the Government's domain to private sites.

The GPO currently ensures the authenticity of the congressional information it disseminates to the public through its Federal Digital System and the Library of Congress's THOMAS system by the use of digital signature technology applied to the Portable Document Format (PDF) version of the document, which matches the printed document. The use of this technology attests that the digital version of the document has not been altered since it was authenticated and disseminated by GPO. At this time, only PDF files can be digitally signed in native format for authentication purposes. There currently is no comparable technology for the application and verification of digital signatures on XML documents. While the GPO currently provides bulk data access to information products of the Office of the Federal Register, the limitations on the authenticity and integrity of those data files are clearly spelled out in the user guide that accompanies those files on GPO's Federal Digital System.

The GPO and Congress are moving toward the use of XML as the data standard for legislative information. The House and Senate are creating bills in XML format and are moving toward creating other congressional documents in XML for input to the GPO. At this point, however, the challenge of authenticating downloads of bulk data legislative data files in XML remains unresolved, and there continues to be a range of associated questions and issues: Which Legislative Branch agency would be the provider of bulk data downloads of legislative information in XML, and how would this service be authorized. How would `House' information be differentiated from `Senate' information for the purposes of bulk data downloads in XML? What would be the impact of bulk downloads of legislative data in XML on the timeliness and authoritativeness of congressional information? What would be the estimated timeline for the development of a system of authentication for bulk data downloads of legislative information in XML? What are the projected budgetary impacts of system development and implementation, including potential costs for support that may be required by third party users of legislative bulk data sets in XML, as well as any indirect costs, such as potential requirements for Congress to confirm or invalidate third party analyses of legislative data based on bulk downloads in XML? Are there other data models or alternative that can enhance congressional openness and transparency without relying on bulk data downloads in XML?

The Committee directs the establishment of a task force composed of staff representatives of the Library of Congress, the Congressional Research Service, the Clerk of the House, the Government Printing Office, and such other congressional offices as may be necessary, to examine these and any additional issues it considers relevant and to report back to the Committee on Appropriations of the House and Senate.