Last year, a group of us who work daily with open government data -- Josh Tauberer of GovTrack.us, Derek Willis at The New York Times, and me -- decided to stop each building the same basic tools over and over, and to start building a foundation we could share.

We set up a small home at github.com/unitedstates, and kicked it off with a couple of projects to gather data on the people and work of Congress. Using a mix of automation and curation, these projects gather basic information from all over the government -- THOMAS.gov, the House and Senate, the Congressional Bioguide, GPO's FDsys, and others -- that everyone needs in order to report on, analyze, or build nearly anything to do with Congress.

Once we centralized this work and started maintaining it publicly, we began getting contributions almost immediately. People educated us on identifiers, fixed typos, and gathered new data. Chris Wilson built an impressive interactive visualization of the Senate's budget amendments by extending our collector to find and link the text of amendments.

This is an unusual, and occasionally chaotic, model for an open data project. github.com/unitedstates is a neutral space; GitHub's permissions system allows many of us to share the keys, so no one person or institution controls it. This means that while we all benefit from each other's work, no one is dependent on, or "downstream" from, anyone else. It's a shared commons in the public domain.

A few principles have helped make the unitedstates project worth our time, which we've listed below.