Continue reading

What Redistricting Means For Sunlight’s APIs
With the election four months away, we're starting to get questions about when our various APIs, and the projects that depend on them, will return the newly redistricted legislative boundaries.
The short answer is that we will most likely not be able to support new district boundaries until after the November 2012 election (but before they technically go into effect for the purposes of representation in 2013).
If you're interested in the reasoning behind this decision, read on. But be warned: redistricting is a lot more complex than "the state has released new boundaries, let's load them."
Continue reading

Tools for Transparency: Using Python for government transparency
Disclaimer: The opinions expressed by the guest blogger and those providing comments are theirs alone and do not reflect the...
Continue reading

Government: Do You Really Need An API?
As the term "API" has become more widely recognized through its ubiquity in social media and other web services, its coolness factor has grown considerably, and it has become something frequently demanded of government.
But does government really need to rush around and make APIs for all of its stuff? Peter Krantz argues that offering direct downloads of bulk data is a much more scalable, simple, and sane solution in most cases.
You should go read his article, rather than just our summary. But specifically...
Continue reading

Follow State Legislatures with the New Open States iOS App
Today the Sunlight Foundation is launching our Open States iPhone and iPad app, which puts the inner workings of state legislatures in the palm of your hand.
Continue reading

Regulations.gov Gets an API & More
Sunlight has been interested in the federal rule-making process for quite a while: we sponsored the app contest that led to the current incarnation of federalregister.gov, which lists federal regulations as they are published, and kick-started an effort to map regulations to the laws that authorize them during a hackathon late last year. We also have extensive experience in the analysis of corporate influence on the political process, having launched several prominent influence-related projects under the Influence Explorer banner. During the last year, we’ve begun to examine the confluence of these two interest areas: corporate influence on the regulatory process, and, in particular, the comments individuals and corporations can file with federal agencies about proposed federal regulations. The first glimpses of the results of this effort went live on Influence Explorer last fall, with the addition of regulatory comment summaries to corporations’ profile pages.
Given this history, we’ve been excited to explore this week’s relaunch of regulations.gov, the federal government’s primary repository of regulatory comments, and the source of the data that powers our aforementioned Influence Explorer regulatory content. This new release brings with it a much-needed visual spruce-up, as well as improved navigation and documentation to help new users find and follow regulatory content, and a suite of social media offerings that have the potential to expose rule-making to new audiences. There have also been some improvements to document metadata, such as the addition of category information visitors can use to filter searches by industry, or browse rule-makings topically from the homepage.
Of more interest to us as web developers is the addition, for the first time, of official APIs to allow programmatic access to regulatory data. It’s clear that the regulations.gov team has taken note of current best practices with respect to open data APIs, and has produced clean, RESTful endpoints that allow straightforward access to what is, especially for a first release, a reasonably comprehensive subset of the data made available through the general end-user web interface. While we have been successful in performing significant regulatory analysis absent these tools, our work required substantial effort in screen-scraping and reverse engineering, and we expect that other organizations hoping to engage in regulatory comment analysis will now be able to do so without the level of technical investment we’ve had to make.
Of course, there is still work to be done. Much of the work we’ve done so far on regulations, and that we hope still to do, revolves around analysis of the actual text of the comments posted to regulations.gov (which can take the form of PDFs and other not-easily-machine-readable documents), and depends on being able to aggregate results over the entirety of the data, or at least significant subsets of it. As a result, even with these new APIs, we’ll still need to make large numbers of requests to identify new documents, enumerate all of the downloadable attachments for each one, download these attachments one at a time, and maintain all of the machinery necessary to do our own extraction of text from them.

While we’re fortunate to have the resources to do this ourselves, and have made headway in making the fruits of our labors available to the public, it would certainly behoove the regulations.gov team to move forward with bulk data offerings of their own. Sunlight has a long history of advocating the release of bulk data in addition to (and perhaps even before) APIs, and the regulatory field illustrates many of our typical arguments for that position: the kinds of questions that can be answered with all of the data are fundamentally different from those that can be answered with any individual piece. We recognize that offering all of the PDFs, Word documents, etc., to the public might be cost-prohibitive from a bandwidth point of view, but regulations.gov is doing text extraction of its own (it powers the full-text search capabilities that the site provides), and offering bulk access to the extracted text, as we have done, could provide a happy medium that would facilitate many applications and analyses without breaking the bandwidth bank.
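The crawling loop described above -- page through the document listing, then queue each document's attachments -- can be sketched generically. Note that the response shape here (a "documents" list) is an invented stand-in for illustration, not the actual regulations.gov API format; consult their documentation for the real field names.

```python
# Sketch of paginated document enumeration. fetch_page stands in for a
# real HTTP call to a documents endpoint; here we demo it against an
# in-memory fake so the pagination logic itself is clear.

def iter_documents(fetch_page, page_size=100):
    """Yield every document by walking paginated responses until empty."""
    offset = 0
    while True:
        page = fetch_page(offset=offset, per_page=page_size)
        docs = page.get("documents", [])
        if not docs:
            break
        for doc in docs:
            yield doc
        offset += len(docs)

# In-memory stand-in for the remote API:
_fake_db = [{"documentId": "EPA-HQ-%04d" % i} for i in range(250)]

def fake_fetch(offset, per_page):
    return {"documents": _fake_db[offset:offset + per_page]}

all_docs = list(iter_documents(fake_fetch))
print(len(all_docs))  # 250
```

In a real crawler, `fake_fetch` would be replaced with an HTTP request, and each yielded document would feed a second loop that downloads its attachments for text extraction.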
In general, we see plenty of reasons to applaud this release and the team at EPA that's behind it. While many of its changes are cosmetic and additional improvements will be necessary for regulations.gov to reach its full potential, this update promises further progress that will benefit developers and members of the public alike. We share the enthusiasm of the regulations.gov team for increasing access to and awareness of these crucial artifacts of the democratic process, and look forward to engaging with them and the broader open government community as they continue to improve this public resource.
Continue reading

Don’t Use Zip Codes Unless You Have To
Many of us in the labs found it thrilling to watch the internet community unite around opposition to the SOPA and PIPA bills yesterday. Even more gratifying was seeing how many participating websites used our APIs to help visitors find their elected representatives. This kind of use is exactly why we built those tools, and why we'll always make them freely available to anyone who wants to make government more accessible to its citizens.
Still, I'd be lying if I said we don't occasionally wince when we see someone using our services in a less-than-ideal way. It's completely understandable, mind you: the problem of figuring out who represents a given citizen is tougher than you might think. But we hate to think that anyone is getting bad information about which office to call -- talking to the people who represent you should be simple and easy! Since this comes up with some frequency, it's probably worth talking about the nature of these problems and how to avoid them.
TL;DR: Looking up congressional districts by zip code is inherently problematic. Our latitude/longitude-based API methods are much more accurate, and should be used whenever possible.
The first complication is probably obvious: zip codes and congressional districts aren't the same thing. A zip code can span more than one district (or even more than one state!), so if you want to support zip lookups for your users, you'll have to support cases where more than one matching district is returned. Our API accounts for this, but it's important that your code do so, too. We err on the side of returning inclusive results when a zip might belong to multiple congressional districts.
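In code, that means always treating a zip lookup result as a list of districts, never a single value. The response shape below (a "results" list of state/number pairs) is an invented stand-in for illustration, not the actual API format:

```python
# A zip can map to several districts (or even several states), so handle
# every lookup result as a list rather than assuming exactly one match.

def districts_for_zip(response):
    """Return every (state, district) pair in a lookup response."""
    return [(d["state"], d["number"]) for d in response.get("results", [])]

# A zip that straddles two districts:
resp = {"results": [{"state": "MD", "number": 4},
                    {"state": "MD", "number": 8}]}

for state, number in districts_for_zip(resp):
    print("%s-%d" % (state, number))  # prints MD-4, then MD-8
```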
Unfortunately, things are actually more complicated than that. Most people don't realize it, but zip codes describe postal delivery routes -- the actual routes that mail carriers travel -- not geographically bounded areas. Zip codes are lines, in other words, while congressional districts are polygons. This means that mapping zips to congressional districts is an inherently imperfect process. The government uses something called a zip code tabulation area (ZCTA) to approximate the geographic footprint of a given zip as a polygon, and this is what we use to map zip codes to congressional districts. But it really is just an approximation -- it's far from perfect.
It's much better to skip the zip code step entirely: with a precise geographic coordinate pair instead of a hazy zip code, you can look up a location directly against the congressional district shapefiles published by the Census Bureau. Thanks to the Chicago Tribune News App Team's excellent Boundary Service project, we offer exactly this capability. If you can, we strongly encourage you to get a precise latitude/longitude pair from your users (either by geolocating them or geocoding their full address), then use it to determine their representatives.
"But what about house.gov's ZIP+4 congressional lookup tool?" I hear you asking. It's true, many House offices use this tool to determine who your representative is (and whether you're allowed to email them). Unfortunately, just because this tool is on an official site doesn't mean it's perfect. Here in the Labs, Kaitlin (who lives in Maryland) can't write her representative because the ZIP+4 tool gives incorrect results. Besides, not that many people know their full nine-digit ZIP+4 code.
So if you can, use latitude/longitude pairs. If you can't, and have to depend on zips, we'll supply results that are very, very good -- but not as good as real coordinates would allow.
Continue reading

Announcing the Return of “Capitol Words”
More than three years ago, we launched a website called Capitol Words that gave an at-a-glance view of the most popular words in Congress. Today, the Sunlight Foundation is unveiling the completely revamped and rewritten Capitol Words.
Continue reading

Tools for Transparency: Better ways to contact members of Congress
A lot of Americans are trying to make their voice heard in the debt ceiling negotiations. So many, in fact,...
Continue reading

The Real Time Congress API
Today we're making available the Real Time Congress API, a service we've been working on for several months, and will be continuing to expand.
The Real Time Congress API (RTC) is a RESTful API over the artifacts of Congress, kept up to date in as close to real time as possible. It consists of several live feeds of data, available in JSON or XML. These feeds are filterable and sortable and sliceable in all sorts of different ways, and you can read the docs to see how.
RTC replaces and deprecates the Drumbone API, which is no longer recommended for use.
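As a taste of what querying a filterable, sortable REST feed looks like, here's a sketch that composes a request URL with the standard library. The endpoint path and parameter names below are illustrative guesses, not the API's actual interface -- the docs are the authoritative reference:

```python
# Compose a feed URL with filter/sort parameters. The base URL, feed
# names, and parameter names are assumptions for illustration only.
from urllib.parse import urlencode

BASE = "http://api.realtimecongress.org/api/v1"

def feed_url(feed, fmt="json", **params):
    """Build a URL for one of the API's live feeds."""
    query = urlencode(sorted(params.items()))
    return "%s/%s.%s?%s" % (BASE, feed, fmt, query)

url = feed_url("floor_updates", chamber="senate", order="timestamp")
print(url)
```

Fetching that URL with any HTTP client would return the feed as JSON; swapping `fmt="xml"` would request the XML representation instead.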
Continue reading