Sunlight Labs

 

The 12 Days of APIs

‘Tis the season for application programming interfaces. Sunlight is in a festive mood. Not only are we hosting a pretty rad open house this week, but we have the perfect present for the open data developer in your life: a Sunlight Labs API key!

Here are our “12 days of APIs,” with a few bulk data sets thrown in to round it out. No singing required! Be sure to also check out some new additions and better accessibility we’ll have available in 2013.

12 minutes spent researching our API offerings on Sunlight Academy, which includes a brief tutorial video.

11 television markets reported more than 1,500 political ad filings this election. Download data about who bought more than $3 billion in political ads in 2012 from Political Ad Sleuth.

10 methods provided in the Sunlight Congress API. Our most popular API includes basic information on members of Congress, legislator IDs and lookups between places and the politicians that represent them.

9 political races had more than $20 million in outside spending this election. Download the bulk data on the money spent by super PACs, unions, corporations, nonprofits and other groups this cycle at Follow the Unlimited Money.

8 data sets covered by the Influence Explorer API (née TransparencyData), which includes federal and state campaign contributions, federal lobbying, government grants and contracts, EPA violations, federal regulations and more.

7 collections presented in the Real Time Congress API. Get as close to real-time data as possible on bills, votes, amendments, videos, floor updates, committee hearings and documents.

6 standard arguments to query in the Capitol Words API. Search the Congressional Record since 1996 and filter your results by state, party, chamber, date, start date or end date.

5(0) states available in the Open States API, which also covers D.C. and Puerto Rico. Use the RESTful API or bulk download to access the only comprehensive collection of state legislative data in the U.S.

4 ways to get Political Party Time data. Use the JSON feed, CSV file, RSS feed or relational zip file to know when politicians are fundraising and who is hosting the events.

3 mobile apps powered by our APIs: Real Time Congress for iPhone, Congress for Android and OpenStates for iPhone and iPad. (And check out Call on Congress if you don’t have a smartphone.)

2 options to get Scout alerts: by email or via text message. Scout uses a variety of Sunlight APIs (Capitol Words, Real Time Congress and Open States) to deliver real-time policy alerts on state and national issues, and it has a special user option for developers.

And a listserv to follow what’s happening in Sunlight Labs.

Flickr photo of partridge in a pear tree light display by K. van Santen.

Check Out the Open States Beta Site

A view of the Open States beta site on the District of Columbia page.

If you don't read the Sunlight Labs blog as religiously as I do (you should!), you might miss that our Open States project now has a public beta site up and running. Users can find out who their state reps are, along with their voting records and contact information, see the most recent actions taken by the legislature and search the full text of current and past bills. The OpenStates.org site is now the best place to find information on the activities in these 20 state legislatures: Alaska, Arizona, California, District of Columbia, Delaware, Florida, Hawaii, Idaho, Illinois, Louisiana, Maryland, Minnesota, Montana, North Carolina, New Hampshire, New Jersey, Ohio, Texas, Utah and Wisconsin, with more coming soon.

For more information on the new site read my colleague James Turk's post here. Also be sure to play with the iOS app or, if you're a developer, use the API and join the Google Group. Our Scout project incorporates the Open States API to allow users to follow and search bills in all 50 states.

Please check out the beta site and let us know what you think. We are constantly adding features and are rapidly expanding the site to include the remaining states. Thank you to the Rita Allen Foundation for their continued support of this project.

Tools for Transparency: Using Python for government transparency

Disclaimer: The opinions expressed by the guest blogger and those providing comments are theirs alone and do not reflect the opinions of the Sunlight Foundation or any employee thereof. Sunlight Foundation is not responsible for the accuracy of any of the information within the guest blog.

Earlier, our labs team introduced Python-Sunlight, an open source project that will unify all Sunlight APIs and make them easier to use. Today, Eric Davis, a lead developer at the Nevada Policy Research Institute, a think tank located in Las Vegas, has taken on using Python for government transparency and is here to share the steps with us. Eric specializes in using technology to make government more open and has developed and currently maintains TransparentNevada, Nevada Journal and TweetNevada.

Note: The code in this post was written for Mac/Linux. It'll run fine on Windows, but you'll need to make some adjustments around path names when using virtualenv and when placing your API key in your home directory.

The past few years have seen an explosion in the amount of publicly available government data. From White House visitor logs to House expenditures to electronic campaign finance data, there is an unprecedented amount of government data available. This information shines a light on how our government operates, while also requiring bigger and more powerful programs to help make sense of it all.

Compared to the average computer user, it’s likely that most transparency activists already possess an above-average level of computer literacy. We work with huge spreadsheets and massive databases on a daily basis. Yet as important and useful as these programs are, in each case we’re forced to work within the constraints of the program itself.

So what happens if you need to do something that can’t be accomplished with the programs you already use?

You create your own.

For the rest of this article, I’m going to focus on the Python programming language and why it is a good “tool for transparency.” Two features, in particular, make Python an excellent programming language: solid documentation and extensive libraries. When you’re first starting out, having access to well-written documentation can be the difference between “getting it” and “getting lost.” In addition, an extensive number of libraries — pre-written code to handle common tasks — are easily available, which helps you focus on the task at hand rather than re-inventing the wheel.

To help introduce Python, we’re going to write a mini-program that collects the names and Twitter accounts for all the members of Congress.

If you’re a complete beginner to programming, I recommend you read at least the first few chapters of Zed Shaw’s “Learn Python the Hard Way” before continuing. It’s a free, online book that will teach you the basics of running simple programs. At a bare minimum, you should be able to complete exercises 0 and 1. You should be comfortable with editing text in a text editor and running programs from the command line before moving on.

For this program, we’ll be using “Sunlight Labs Services,” a service provided by the Sunlight Foundation that enables programmers to access government data easily and efficiently. The first thing you’ll need to do is register for a key. Click that link and enter your name and email along with the place you work. For “intended usage” put something like “Grab Twitter accounts for members of Congress.” Once you receive your key via e-mail, open your text editor and copy it into a file called ‘.sunlight.key’ (note the leading period) in your home directory. With this key, you’ll be able to access all of the Sunlight Labs Services.
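If you would rather not hunt for your home directory by hand (its location differs between Mac/Linux and Windows), a few lines of Python can work it out for you. This is only a rough sketch: the function names are made up for illustration, and the only details taken from the steps above are the file name ‘.sunlight.key’ and the fact that it lives in your home directory.

```python
import os


def key_file_path(home=None):
    """Return the full path of '.sunlight.key' inside the home directory."""
    if home is None:
        # expanduser("~") resolves to the home directory on Mac, Linux and Windows
        home = os.path.expanduser("~")
    return os.path.join(home, ".sunlight.key")


def save_key(api_key, home=None):
    """Write the key (minus stray whitespace) and return the file's path."""
    path = key_file_path(home)
    with open(path, "w") as key_file:
        key_file.write(api_key.strip())
    return path


if __name__ == "__main__":
    # Only prints where the key file belongs; it does not create or change it.
    print(key_file_path())
```

Calling save_key() with the key from your e-mail would then create the file in the right place on any of the three systems.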

Next, download virtualenv.py into your home directory. Now open your terminal and type:

python virtualenv.py learn-to-program

After that, change into the ‘learn-to-program’ directory. Now, we’re going to install two libraries. Type:

./bin/pip install sunlight tablib

That’s it for the setup; now comes the fun stuff.

Open up your text editor and type in the following:

 

Save this file as twitter_accounts.py inside the ‘learn-to-program’ directory that was created earlier.

There are five “parts” of this program, each separated by an empty line:

Line 1:

Remember those libraries we installed with ./bin/pip? We import them so they can be used.

Line 3:

Now that we’ve imported the sunlight library, here’s how we’re going to use it. This creates a variable — lawmakers — that holds information on each lawmaker currently in Congress.

Line 5:

Just like we made use of the sunlight library above, now we’re going to use tablib (short for “tabular library”) here. This creates another variable, names_and_twitter, that will hold lawmaker names in one column and Twitter accounts in another. We also tell it that the data will have the headers “name” and “twitter.”

Lines 7-10:

Line 7 goes through each lawmaker in lawmakers. Lines 8-10 are run for each lawmaker in the lawmakers variable. First, it sets the name variable by combining the lawmaker’s first and last names with a space. Next, it sets the twitter variable to the lawmaker’s twitter_id. Finally, it appends the name and twitter variables to the names_and_twitter dataset for use in the next step.

Lines 12-14:

This creates a file, ‘twitter.xls’, and tells Python we’re going to be writing binary data to it. The next line writes the data from the names_and_twitter variable as an Excel spreadsheet to the file. Finally, we close the file to tell Python we’re done with it.

Now, back in your terminal and from inside the ‘learn-to-program’ directory, type:

./bin/python twitter_accounts.py

This tells Python to run the code you just entered.

 

Assuming everything was typed correctly, you’ll now have a ‘twitter.xls’ file next to your ‘twitter_accounts.py’ file. Open ‘twitter.xls’ with Excel or OpenOffice and you’ll see the full name in column A and that lawmaker’s twitter account in column B.

Congratulations: You just created your first program!

If you don’t have Excel or OpenOffice, or want to generate a CSV (comma-separated values) file instead of an Excel spreadsheet, replace names_and_twitter.xls with names_and_twitter.csv in line 13 and change ‘twitter.xls’ to ‘twitter.csv’ in line 12. Whereas Excel files have to be opened with special programs, one nice thing about CSV files is that they are plain text and can be used to copy data to various other systems (like databases) quite easily.
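To see just how plain a CSV file is, here is a minimal sketch of the same name-and-Twitter loop using only Python's built-in csv module (written for Python 3). It is not the program from this post: the two sample records are invented, and the field names ‘firstname’, ‘lastname’ and ‘twitter_id’ are assumptions about what the API returns. There is no API call and no tablib, so it runs without a key.

```python
import csv
import io

# Hypothetical sample records standing in for the API response.
SAMPLE_LAWMAKERS = [
    {"firstname": "Jane", "lastname": "Doe", "twitter_id": "janedoe"},
    {"firstname": "John", "lastname": "Roe", "twitter_id": ""},
]


def to_csv(lawmakers):
    """Return CSV text with a 'name' column and a 'twitter' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "twitter"])  # header row
    for lawmaker in lawmakers:
        # Combine first and last names with a space, as in the walkthrough.
        name = lawmaker["firstname"] + " " + lawmaker["lastname"]
        writer.writerow([name, lawmaker["twitter_id"]])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(SAMPLE_LAWMAKERS))
```

The result is ordinary text, one lawmaker per line with a comma between the columns, which is why so many other programs can read it.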

So where do you go from here? Try adding features to what you just wrote. Include the lawmaker’s party next to his or her name. Add another column with the lawmaker’s phone number (don’t forget to update the dataset headers). Explore other parts of the Congress API to, for example, find all the lawmakers for a given zip code.
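Here is one way the first two suggestions might look, sketched with invented sample records rather than a live API call. The ‘party’ and ‘phone’ field names (like ‘firstname’, ‘lastname’ and ‘twitter_id’) are assumptions, so check the Congress API documentation for the real ones before relying on them.

```python
# Hypothetical records; a real run would get these from the Congress API.
SAMPLE_LAWMAKERS = [
    {"firstname": "Jane", "lastname": "Doe", "party": "D",
     "phone": "202-555-0100", "twitter_id": "janedoe"},
    {"firstname": "John", "lastname": "Roe", "party": "R",
     "phone": "202-555-0101", "twitter_id": "johnroe"},
]

# The headers must list every column, in the same order as each row.
HEADERS = ["name", "twitter", "phone"]


def build_rows(lawmakers):
    """Return one [name, twitter, phone] row per lawmaker."""
    rows = []
    for lawmaker in lawmakers:
        # Put the party in parentheses next to the name, e.g. "Jane Doe (D)".
        name = "%s %s (%s)" % (
            lawmaker["firstname"], lawmaker["lastname"], lawmaker["party"])
        rows.append([name, lawmaker["twitter_id"], lawmaker["phone"]])
    return rows
```

With tablib, you would create the dataset with these three headers and append each row inside the loop, just as the two-column version does.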

Have fun!

Interested in writing a guest blog for Sunlight? Email us at guestblog@sunlightfoundation.com

 

OpenGovernment Minnesota Launches Today

Residents of Minnesota now have a new way to keep track of what’s happening in their state with the launch of OpenGovernment Minnesota. The “land of 10,000 lakes” is the latest state added to OpenGovernment, a joint project of the Sunlight Foundation and the Participatory Politics Foundation, along with support from the Minnesota Historical Society.*

Visit MN.opengovernment.org to get the real story behind what's happening in government across the state via official government information, local news coverage, blog posts and social media alerts.

Writes David Moore on the OpenGovernment blog:

Now folks in Minnesota can track with ease everything their state legislature does — all the bills that are proposed, votes that are taken, money that was raised, and more. We’ve timed the launch of this, the sixth U.S. state on OpenGovernment, to coincide with the Netroots Nation conference ongoing this weekend in Minneapolis / St. Paul. We’re pleased to share this new public resource for accountability in government and citizen watchdogging with all the political bloggers & issue-based activists there.

The Sunlight Labs Open States project developed the legislative backend for OpenGovernment. Supported in part by the work of volunteers, the Open States project’s goal is to collect and scrape legislative data from all 50 state legislatures and make it available online in a unified, developer-friendly format.

* Update: Additional support from the Library of Congress National Digital Information Infrastructure and Preservation Program.

 

Announcing Sarah's Inbox

A screenshot of Sarah's Inbox, a project of the Sunlight Foundation.

Today the Sunlight Foundation is proud to unveil Sarah's Inbox, our attempt to make Sarah Palin's recently released email records easier to use with a searchable function and an interface similar to Gmail. It builds on Elena's Inbox, our wildly popular project launched almost exactly one year ago that took the email data of Supreme Court Justice Elena Kagan released by the Clinton Library and made it more accessible online.

Sarah's Inbox allows users to view the more than 14,000 emails from Sarah Palin's tenure as Governor of Alaska with familiar sorting functions. You can go page by page starting from the most recent emails or, most importantly, search. To help direct folks to interesting items, try some of our sample searches, star emails for later viewing or view the most starred emails by all users.

The project started after we were again approached by folks on Twitter and the Sunlight Labs list (join!) to take this ugly data and add the Sunlight secret sauce to make it user friendly. Initially we were cautious because the cast of characters who directly obtained the data included the likes of the New York Times, ProPublica, Mother Jones and MSNBC.com. We spoke with ProPublica and they encouraged us to take a stab at fashioning our own tool, so we borrowed their data and went to work. Sarah's Inbox would not be possible without the great people at Crivella West, who gathered, lifted, scanned and paid for all this data.

Like Elena's Inbox, Sarah's Inbox faced staggering issues of data quality because government officials continue to release digital files as hideous printouts requiring a laborious and error-ridden optical character recognition (OCR) pass over. You will notice that many of the emails are garbled, incomplete or contain odd characters - please keep in mind that we did the best with what we had and are not responsible for the content. Due to the programmatic nature of the tools used to build this site, we recommend checking any research effort against the source files.

Disclaimers aside, please enjoy Sarah's Inbox and tweet interesting items you find with #sarahsinbox.

Success Has Many Parents, Colleagues and Friends - Thank You!

This week marks the fifth anniversary of the Sunlight Foundation. It is exciting to reflect on how far we've come, the great people who helped us along the way and where we plan to go. With your help, we've grown from a small organization with big ideas to a connected community whose call for greater government openness and transparency is heard throughout the country.

We began with the nonpartisan goal of using the revolutionary power of the Internet and new technology to open government information. When we started, this modern interpretation of transparency was almost a completely foreign idea in Washington - a place where corrupt lobbyist Jack Abramoff dominated the headlines and sifting through reams of paper in order to get at the truth of what was going on was the status quo. While ordinary citizens were embracing new media tools and websites that gave them a readily available stream of information at their fingertips, government showed little interest in keeping up with the times.

Right out of the gate, we took on these age-old issues with a fresh arsenal of online tools and empowered citizens to engage in new forms of direct oversight. We believed then, as we still do, that none of us are as smart as all of us and that we have a stronger democracy when open information gives people the ammunition they need so they can speak truth to power. Sunlight developed all kinds of new tools and websites to achieve these goals and get the public involved in the political process.

We encouraged lawmakers to post their schedules online and launched the Open House Project to engage policy experts, citizens and lawmakers in a conversation on all the ways the House of Representatives could update how it shares information with the public. We initiated and funded dozens of projects to create online databases of government information, covering everything from earmarks to congressional fundraiser invitations to foreign lobbying disclosures to House expenditures. We created mobile applications to put Congress into the hands of the people and fostered a community of thousands of 'civic hackers' to build better tools. We updated legislative rules and collaboratively wrote new policy to open government to the Internet age. We've trained thousands of journalists and citizens in using data and in using the web to watchdog Washington. We modeled government websites to show what is possible and followed the money, lobbying and the influence industry with ongoing reporting projects.

Through it all we are most inspired by and proud of the people who take action and participate in the process to improve our democracy. Thank you to the countless people and organizations who have worked with us, used our tools and dug deep into our websites through our first five years. The Sunlight Foundation will continue to work with you to explore how to enhance our democracy and citizen engagement with our public officials using online tools. Sure, there's a lot more to be done. As a wise person once said, if this was easy, it would have been done already. And we promise you - the best is yet to come!

Please continue to support our work to keep the light shining on government.

Sunlight Live's State of the Union Coverage

Last night the Sunlight Foundation's award-winning Sunlight Live platform covered the State of the Union with context and fact-checking from Sunlight's Reporting Group and teams from the Huffington Post, National Journal, CQ Roll Call and the Center for Public Integrity. It was an exciting evening and we're honored so many of you chose to join our coverage for Obama's speech, the Republican response and the Tea Party followup.

More than 10,000 people tuned in and, while at times we were nearly technically overwhelmed by the response, our talented Sunlight Labs team held us together. The engaged viewers left over 1,000 comments and we published more than a third of them to be answered by the reporters or shared with other visitors. Hundreds of folks camped out on the site hours before the speech, indicating their preference to watch on our channel. As best as we can measure, 2,308 tweets and 908 shares on Facebook sent fans to Sunlight Live.

Here are some excerpts from the various news coverage our Sunlight Live project received:

Roshan Nebhrajani from Medill's School of Journalism joined Sunlight in our office and reported on the experience:

A group of 14 reporters gathered at the Sunlight Foundation, all centrally connected by one crucial link — a heavy-duty extension cord — as they typed through dinner to provide interactive coverage of the address to nearly 2,000 viewers. [...] visitors to the Sunlight Foundation’s site engaged in online conversation. One said: “This is perfect. Like sitting in the room, watching with a bunch of smart, informed people.”

GOOD Magazine initially promoted the White House live-stream online but switched to support Sunlight Live after learning the extent of our coverage:

While we indeed support the government smartly using technological advancements to spread information, in this case, we're going to direct you away from the White House's stream and toward the Sunlight Foundation's live blog. Not only will Sunlight be streaming video of the address, reporters from CQ Roll Call, the National Journal, the Center for Public Integrity, and the Huffington Post will be on hand to fact-check and offer context as the president speaks. We can almost guarantee that the information provided will be more objective and less dry than what the White House is offering. Happy viewing.

Fast Company did a roundup of all the various ways to watch the State of the Union and highlighted the collaborative and real-time reporting during Sunlight Live:

Traditionally, we’ve had to wait for the networks’ post-game shows before anyone starts to dissect the accuracy of various statements made by the president or the opposition. But last night, the Sunlight Foundation—in partnership with The Huffington Post, National Journal, CQ Roll Call, and the Center for Public Integrity—posted real-time fact-checking during the course of the addresses.

MediaBistro has an article about the new dawn of real-time fact checking that points to the work of the Sunlight Foundation and the Sunlight Live event:

Gone are the days when political junkies would have to wait for a speech to be over before talking heads could endlessly parse each word. [...] with our incredible shrinking news cycle and the rise of participatory journalism, the approach only makes sense.

It was a great team effort at Sunlight and we loved working with our partners from the Center for Public Integrity, National Journal, CQ Roll Call and the Huffington Post. Thank you to everyone who helped make this Sunlight Live event a success and we hope you join us for future coverage.

Photo by Nicko Margolies

Better Draw a District - Doodle Your Own Gerrymander

High school civics classes teach that democracy is in the hands of voters. This view, though empowering, only tells part of the story. To really understand a democracy, you need to understand how votes are counted. One must shed light on the very machinery that powers our representative democracy: the sometimes bizarrely-shaped geographic boundaries called congressional districts.

Read more

Sunlight's Checking Influence: Find the Politics in Your Pocketbook

The Sunlight Foundation is proud to announce our Checking Influence tool, which gives individuals the power to see the political expenditures of the businesses they frequent. The simple bookmarklet allows users to connect the personal spending habits seen on their online bank or credit card statements with the lobbying and political contributions of companies.

As we start to examine how much we spent on Black Friday or Cyber Monday, Checking Influence will let all of us see how effortlessly politics escapes Washington and settles into our wallets, often without us even knowing it. We created this tool to help Americans be more informed consumers and citizens. Just as some consumers check to see if their coffee is fair trade or if their clothing is manufactured in sweatshops, they can now know if their purchases help fund lobbying campaigns. We’re trying to answer the question: When you buy coffee at Starbucks, refill a prescription at Walgreens or download a song from iTunes, do you know where your money really goes?

How to Use Checking Influence

Using Checking Influence is simple and secure. First, add the Checking Influence bookmarklet to your browser’s toolbar. Next, go to any web page that shows your spending transactions, such as a banking site, your credit card statement or Mint.com. Then, just click on the Checking Influence bookmarklet, and it will find the company names on the transactions list and show you the “influence data” for the corporations it can identify -- including political campaign contributions and what lobbying the corporation conducted.
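For the curious, the matching step can be pictured with a short sketch. This is not the bookmarklet's actual code (which runs in your browser); it is a simplified Python illustration of the idea, and the company list, transaction strings and function names are all invented for the example.

```python
import re

# Hypothetical list of companies for which influence data is available.
KNOWN_COMPANIES = ["Starbucks", "Walgreens", "Apple"]


def normalize(text):
    """Lowercase the text and turn runs of punctuation or spaces into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_companies(transactions, companies=KNOWN_COMPANIES):
    """Map each transaction description to the first known company it mentions."""
    matches = {}
    for description in transactions:
        # Pad with spaces so only whole words match ("apple" is not in "pineapple").
        padded = " " + normalize(description) + " "
        for company in companies:
            if " " + normalize(company) + " " in padded:
                matches[description] = company
                break
    return matches
```

Given a statement line such as "STARBUCKS #1234 WASHINGTON DC", a lookup like this would pair it with "Starbucks", and the influence data for that company could then be fetched and displayed.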

Behind the Curtain

The backbone of Checking Influence is TransparencyData.com, Sunlight’s open-source, central repository for federal lobbyist registrations, federal grants and federal and state campaign contributions. Sunlight Labs, the Foundation's in-house team that builds technology and Internet tools to make government more transparent and accountable, developed Checking Influence. The site is built upon the public Transparency Data API, whose data is provided by the Center for Responsive Politics and the National Institute for Money in State Politics.

A Note About Security

We understand that everyone is cautious about banking information online (and rightfully so!), which is why the Sunlight Foundation has taken a number of steps to ensure that Checking Influence is safe to use. Checking Influence uses the same industry-standard SSL encryption that your banking site does to keep your financial information secure and we don't save any personally identifiable information. The tool is simply searching bank statements for transactions with company names that match information from TransparencyData.com.

Please contact us with any feedback and we hope you enjoy playing around with this new tool!

Using Technology To Assist Declassification

by Sunlight Foundation policy intern Melanie Buck

The role technology can play in streamlining the declassification process was the topic of a Public Interest Declassification Board meeting on Thursday, Sept. 23. The PIDB is a congressionally established advisory committee that works to facilitate public access to national security-related records. It is considering how to advise agencies on their efforts to declassify approximately 410 million pages of records by December 2013. An agenda for the meeting is available here [PDF].

The Board heard presentations on the feasibility of using automated computer systems to streamline document review. The speakers were Jeff Jonas from IBM, Tom Lee from the Sunlight Foundation, and John Verdi from the Electronic Privacy Information Center.

Jeff Jonas outlined a hypothetical automated system that would tag documents based on key words and phrases to make predictions about whether a document should be declassified. A high level of accuracy would come from training the system on documents that have already been reviewed, combined with determining how the new document relates to the old information, in a process known as "context accumulation." He compares context accumulation to solving a jigsaw puzzle in this blog post. The technology already exists, but would take some time to implement.

Tom Lee described the requirements for determining whether a document should be declassified, focusing on how a computer system could help prioritize the work queues of reviewers. For example, pages that the system determines most likely to be sensitive can be reviewed first, and if the system is determined to have made a correct judgment, the rest of the document (and potentially the document series) can be removed from the work queue. A search algorithm could be trained to return sophisticated results that would give human reviewers a clear indication of the content of a given document. Such a system would involve a fixed up-front cost, with additional computational costs varying with the system's operational speed.
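The queue-prioritization idea can be illustrated with a toy sketch. It is not the system described at the meeting: the sensitivity scores, field names and the reviewer_agrees callback are all hypothetical, and a real review process would be far more involved.

```python
def prioritize(pages):
    """Order pages so those scored most likely to be sensitive come first."""
    return sorted(pages, key=lambda page: page["score"], reverse=True)


def review(pages, reviewer_agrees):
    """Walk the queue in priority order; return the (doc, page) pairs a human saw.

    reviewer_agrees(page) stands in for the human reviewer: it returns True
    when the reviewer confirms the system's judgment about that page.
    """
    reviewed = []
    settled_docs = set()  # documents whose remaining pages left the queue
    for page in prioritize(pages):
        if page["doc"] in settled_docs:
            continue  # the rest of this document was removed from the queue
        reviewed.append((page["doc"], page["page"]))
        if reviewer_agrees(page):
            settled_docs.add(page["doc"])
    return reviewed
```

In this sketch, one confirmed judgment on a document's highest-scoring page spares the reviewer every other page of that document, which is where the time savings would come from.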

John Verdi approached the issue from a policy perspective, explaining his evaluation of what transparency groups and the public want from a declassification process. He suggested that the preparation of unclassified summaries adds work without adding much public value and is an unnecessary burden on the declassification process. In his opinion, resources would be better spent reviewing entire documents and declassifying whenever possible. He also discussed a few transparency tools that many hope to see, such as a large, openly-accessible searchable database of declassified records.

The Board has another meeting scheduled for Nov. 9, 2010 to further investigate ways to facilitate declassification.

Declassification has been a recent focus in Congress as legislators promote a cultural shift from “need to know” to “need to share.” Just last week, Congress sent the "Reducing Over-Classification Act," H.R. 553, to the President for his signature. Among other things, the legislation requires the Director of National Intelligence to establish policies and procedures to identify the classification of portions of information within an intelligence product, which should help facilitate automated review.

We summarized a July 22 discussion of the declassification of historical congressional records here.