Sunlight Foundation

Announcing Sarah's Inbox

Today the Sunlight Foundation is proud to unveil Sarah's Inbox, our attempt to make Sarah Palin's recently released email records easier to use, with a search function and an interface similar to Gmail. It builds on Elena's Inbox, our wildly popular project launched almost exactly one year ago, which took the email data of Supreme Court Justice Elena Kagan released by the Clinton Library and made it more accessible online.

Sarah's Inbox allows users to view the more than 14,000 emails from Sarah Palin's tenure as Governor of Alaska with familiar sorting functions. You can go page by page starting from the most recent emails or, most importantly, search. To find interesting items, try some of our sample searches, star emails for later viewing, or view the emails most starred by all users.

The project started after we were again approached by folks on Twitter and the Sunlight Labs list (join!) to take this ugly data and add the Sunlight secret sauce to make it user friendly. Initially we were cautious because the cast of characters who directly obtained the data included the likes of the New York Times, ProPublica, Mother Jones and MSNBC.com. We spoke with ProPublica and they encouraged us to take a stab at fashioning our own tool, so we borrowed their data and went to work. Sarah's Inbox would not be possible without the great people at Crivella West, who gathered, lifted, scanned and paid for all this data.

Like Elena's Inbox, Sarah's Inbox faced staggering issues of data quality because government officials continue to release digital files as hideous printouts that require a laborious and error-ridden optical character recognition (OCR) pass. You will notice that many of the emails are garbled, incomplete or contain odd characters - please keep in mind that we did the best we could with what we had and are not responsible for the content. Because the site was built with automated tools, we recommend checking any research effort against the source files.

Disclaimers aside, please enjoy Sarah's Inbox and tweet interesting items you find with #sarahsinbox.

Is AFSCME or the Chamber the top political spender?

The Wall Street Journal brings an apple to the orange convention, writing that, "The American Federation of State, County and Municipal Employees is now the biggest outside spender of the 2010 elections, thanks to an 11th-hour effort to boost Democrats that has vaulted the public-sector union ahead of the U.S. Chamber of Commerce, the AFL-CIO and a flock of new Republican groups in campaign spending."

They may well end up being the top spender, but our data currently puts them at number 8, having spent a more modest $9.6 million--significantly less than the $87.5 million the Journal reports. The New York Times, meanwhile, reports that the top spender among non-party committees is the U.S. Chamber of Commerce at $21.1 million--very close to our own figure of $23.6 million (which, in fairness to the Times, is a constantly rising number). Why the discrepancy between the Journal's figures and those that the Times and we put out?

The Journal got its $87.5 million figure directly from AFSCME, and also got political spending totals from the U.S. Chamber of Commerce, Service Employees International Union and other groups. The Sunlight Foundation is totaling spending reported to the Federal Election Commission.

The disparities between what groups say they are planning to spend and what they've reported spending are troubling, to say the least. It's one of the reasons that we in the Reporting Group use formulations like "U.S. Chamber of Commerce reports spending $29.2 million on lobbying (which includes a wide range of political activities) in the third quarter of 2010," rather than saying "the Chamber of Commerce spent..."

Some time in 2011, we'll get more complete annual reports from the labor unions, which disclose their political spending; forms 990 from groups like the U.S. Chamber of Commerce; and year-end reports from 527 organizations. From this information we will begin to piece together how much was spent on the mid-term elections. Even then, it can be daunting.

Let's take a look at one organization: The U.S. Chamber of Commerce disclosed, in the 2008 form 990 it filed with the Internal Revenue Service, $23 million in election-related spending and another $4.7 million in lobbying. It reported spending $16.5 million on electioneering communications to the Federal Election Commission. The Chamber also disclosed to the House and Senate in 2008 that it spent $62 million on lobbying--defined as influencing legislation; participating in any political campaign, including state and local races; attempting to influence the public on political matters or elections; as well as contacts with certain high ranking executive branch officials. Its affiliates spent about $31 million more. Which is the right number?

It might seem like we are mixing apples, oranges and xylophones here (FEC, IRS and lobbying disclosures), but remember, we're trying to get a handle on who spends the most on political activity. The IRS lobbying definition the Chamber uses when it files lobbying reports with Congress doesn't produce a number that matches the two separate numbers it reports to the Internal Revenue Service, or those two numbers added to the FEC number--16.5 + 4.7 + 23 does not equal 62 (being a journalist, I've counted it out on my fingers twice to make sure). This doesn't mean that any of these individual numbers is wrong or inaccurate (although I suspect all of them are to some degree)--just that they report different things.
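For readers who would rather not count on their fingers, the sum can be checked in a few lines of Python. This is only a sketch of the arithmetic in this post: the variable names are made up for illustration, and the FEC figure is assumed to be $16.5 million, as the sum implies.

```python
# Figures in millions of dollars, as quoted in this post for 2008.
# Variable names are illustrative, not taken from any filing.
irs_election = 23.0        # form 990: election-related spending
irs_lobbying = 4.7         # form 990: lobbying
fec_electioneering = 16.5  # FEC: electioneering communications (assumed to be millions)
congress_lobbying = 62.0   # lobbying disclosure filed with the House and Senate

# Add up the three separately reported figures and compare with the lobbying disclosure.
itemized_total = irs_election + irs_lobbying + fec_electioneering
gap = congress_lobbying - itemized_total

print(f"IRS + FEC figures: ${itemized_total:.1f} million")
print(f"Difference from lobbying disclosure: ${gap:.1f} million")
```

The three itemized figures come to about $44.2 million, which leaves roughly $17.8 million of the $62 million lobbying figure unaccounted for by the other two disclosures.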

In the case of the Chamber, the lobbying disclosure form filed with the House and Senate comes closest to capturing total spending but offers no itemization. For labor unions, it's the annual report--form LM2--they file with the Labor Department, which unlike lobbying disclosure forms actually itemizes expenses.

So while one should treat the Wall Street Journal's numbers with a bit of skepticism--consider the source--it's not crazy to ask these groups what they're spending. I'm not sure I'd be comfortable saying that AFSCME has leapt into first place on their say-so, but one should also be careful to recognize that disclosure from the Federal Election Commission is by no means all-inclusive. For example, no U.S. Chamber of Commerce ad that ran more than 30 days before a primary or more than 60 days before the general election had to be disclosed anywhere--except to the local TV, radio or cable operator that ran it. But that's because Federal Communications Commission rules require that disclosure, not federal election law.

And to answer the question in the headline honestly, we'd have to say that at this point we just don't know. According to the FEC, it's the Chamber. According to the groups themselves, it's AFSCME. All we do know is what they are required to report to the FEC, which is the only tool we have right now for tracking outside spending.

Sunlight Live Recap: How We Did It

During the Health Care Summit on Thursday, Feb 25, Sunlight tried something new by connecting a live political event to the government data and information we work to make more accessible every day.

Dubbed "Sunlight Live," our pilot coverage of the joint Republican and Democratic health care summit was a smashing success, thanks to all of you.


New York Times' Represent Feature

The New York Times just launched a new interactive feature called Represent. Represent allows New York City residents to type in their address and receive a stream of political information for all of their elected representatives, from the City Council to the U.S. Senate. The information currently contained in Represent includes mentions in Times articles and congressional votes. It's very much like an EveryBlock for political coverage (and it wouldn't be a bad idea for EveryBlock to integrate this data into their local data streams). The Open blog at the Times explains:

Using your address as a starting point, Represent figures out which political districts you live in and who represents you at different levels of government. It draws maps that show how where you live fits into the political geography of the city. And using information collected from around the Web, it presents a customized activity stream that tracks what the people who represent you are doing.

Represent crawls a collection of New York Times stories and City Room blog posts, looking for references to public officials. It also draws from official data sources — currently, Congressional roll-call votes, which we collect by parsing feeds and scraping government Web sites. It evaluates each article, blog post and vote to find the stories most relevant to you. (Both our article search and our Congressional votes database will soon be available to outside developers through free, open APIs.)
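To make the idea concrete, here is a minimal sketch of what such an activity stream could look like. It is not the Times' code: the `Item` type, the `activity_stream` function, the officials' names and the sample stories are all hypothetical, and plain substring matching stands in for whatever relevance scoring Represent really uses.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A hypothetical article, blog post or roll-call vote."""
    date: str   # ISO date, so string order matches date order
    title: str
    text: str


def activity_stream(officials, items):
    """Return (date, official, title) tuples for items mentioning an official, newest first."""
    hits = []
    for item in items:
        haystack = f"{item.title} {item.text}".lower()
        for name in officials:
            if name.lower() in haystack:
                hits.append((item.date, name, item.title))
    return sorted(hits, reverse=True)


# Made-up officials and stories, for illustration only.
officials = ["Jane Doe", "John Roe"]
items = [
    Item("2009-01-05", "Council passes budget", "Council member Jane Doe voted yes."),
    Item("2009-01-07", "Senate roll call", "Senator John Roe voted no."),
    Item("2009-01-06", "Weather", "Snow expected."),
]

stream = activity_stream(officials, items)
for date, name, title in stream:
    print(date, name, title)
```

A real system would first have to map an address to political districts and then to the officials who represent them; this sketch starts from the list of officials and skips that step.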

The fact that the Times is launching something that serves not just as a supplement to coverage, but also as a public service, shows the direction that large, traditional media sources are heading as they shrink in print and expand online. Another example would be the Washington Post's congressional votes database. The Post also currently experiments with Apture to provide greater context in their political coverage. Here's Apture explaining their partnership with the Post:

I can only imagine that we'll be seeing a lot more information integration from large traditional news organizations in the coming years.

New York Times Opens Archives Online

Update: For some reason it appears the Times has pulled this awesome research tool. I'll try to find out why.

The New York Times launched an amazing research tool, creating a great online browser for all their content from 1851-1922. The Times is also offering the data through an API so that, if you have the skills, you can create your own browser. The Times blog says:

"As part of eliminating TimeSelect, The New York Times has decided to make all the public domain articles from 1851-1922 available free of charge. These articles are all in the form of images scanned from the original paper. In fact from 1851-1980, all 11 million articles are available as images in PDF format. To generate a PDF version of the article takes quite a bit of work — each article is actually composed of numerous smaller TIFF images that need to be scaled and glued together in a coherent fashion."

If you do research - or are in any way in need of scanning the 1855 adverts for local New York haberdashers - this is not to be missed. Check out the TimesMachine. (There might be some kind of server problems right now.)
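The "scaled and glued together" step in the Times' description can be pictured with a small sketch. It makes a simplifying assumption that the post does not confirm: each tile is scaled to a common page width and the tiles are stacked top to bottom. The `layout_tiles` function and the tile sizes are hypothetical, and the sketch only computes where each tile would land rather than touching any image files.

```python
def layout_tiles(tile_sizes, page_width):
    """Scale each (width, height) tile to page_width and stack the tiles vertically.

    Returns the total page height and a list of (y_offset, scaled_height) per tile.
    """
    placements = []
    y = 0
    for width, height in tile_sizes:
        # Keep the aspect ratio while forcing every tile to the same width.
        scaled_height = round(height * page_width / width)
        placements.append((y, scaled_height))
        y += scaled_height
    return y, placements


# Made-up tile sizes in pixels, for illustration only.
page_height, placements = layout_tiles(
    [(1200, 400), (600, 300), (2400, 1600)], page_width=1200
)
print(page_height, placements)
```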

The article to the left references a large-scale congressional investigation into lobbyists' attempts to block President Woodrow Wilson's tariff bill, a key element of his New Freedom agenda. The investigation sought to discover whether Senators had been bribed or unduly influenced by these lobbyists, and ultimately required every sitting Senator to testify about their personal finances, campaign contributions, and relationships with lobbyists and other company agents. This amounted to the first full disclosure by members of Congress of their personal finances, their campaign contributors, and the nature of the lobby. A first for transparency in Congress.
