What you can do with TransparencyData.com


Sunlight Labs announced the release of TransparencyData.com earlier today. I spent some time playing around with the site yesterday and have to say that it completely changes the ways in which researchers tracking campaign finance issues will get their data. The site makes searching, obtaining and downloading data so much easier than it has ever been.

Labs Director Clay Johnson has been tweeting examples of what kind of data you can find through the TransparencyData.com database. (Follow the links.) Here’s another example:

On Monday, an explosion at the Upper Big Branch Mine in West Virginia killed 25 miners and trapped four others. In 2009, the mine was cited for hundreds of violations by the Mine Safety and Health Administration (MSHA), many of them very serious. The mine is owned by Massey Energy, which is run by the politically powerful Don Blankenship, its chairman and CEO. Blankenship is facing harsh criticism for his apparent indifference to MSHA violations. This has led many to look at his political influence in West Virginia, particularly at how he has tried to influence lawmakers and judicial races. With the help of TransparencyData.com, we can easily look up the contributions made by Blankenship, the employees of Massey Energy and the Massey Energy political action committee.

What we see here are the 364 contributions made from 2003 to 2010 by contributors listing Massey Energy as their place of employment. The majority of these contributions come from Blankenship or the company’s political action committee. The vast majority were made in state-level races in West Virginia: legislative, gubernatorial and judicial. You can even see Massey Energy’s win-loss record among the candidates who received its contributions.

I’m not going to pretend to know very much about West Virginia politics, but I can say that anyone writing a story about Blankenship’s influence in West Virginia could obtain the necessary contribution data through TransparencyData.com in seconds to begin or enrich their research. Even a cursory look at the results shows the contributions Blankenship made to his independent political committee, And For The Sake Of Kids.

And For The Sake Of Kids ran a campaign, funded with nearly $2.5 million of Blankenship’s money, to unseat a justice of the West Virginia Supreme Court of Appeals, who Blankenship feared would rule against Massey Energy in a number of appeals that were on the court’s docket. Blankenship’s campaign worked and he installed a sympathetic ear on the court. That sympathetic ear went on to rule in favor of Blankenship’s appeal. The money worked. (The Supreme Court of the United States would later rule that the sympathetic ear, Justice Brent Benjamin, would have to recuse himself from certain cases because Blankenship’s spending on his election created a serious risk of “actual bias.”)

That little bit of research also turned up another example: Massey Energy actually owns a seat in the state legislature. State legislator Troy Andes (WV-14) works for Massey Energy in its Public/Community Relations department. Massey Energy employees spent $8,700 to help elect and re-elect Andes in 2006 and 2008.

I’m sure someone with more knowledge of West Virginia politics could dig much further into this data, or into any other data they’d like. Now that TransparencyData.com exists, there is a whole host of new, incredibly fast queries to run on campaign finance data from the state to the federal level.

  • Hi Earl,

    Thanks for the feedback. Regarding your issue with weird characters in the header and footer, the problem was in the file type that Internet Explorer assigned to the download. The compressed file is a gzipped tar archive, and so should have a .tgz extension, not .gz. The bulk downloads page has now been fixed, so if you redownload the file, or just change the extension on the one you have, and run gunzip again, you should get a plain old CSV.

    Regarding searching for a particular seat, the filter you want is ‘Office’. This lets you choose between Presidential, Senate, House, State Upper and Lower Chambers, and Gubernatorial races.

    We will be working on getting more documentation about the data up. In the meantime, you might want to check out followthemoney.org, our data provider for state-level contributions.
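For readers who would rather skip the manual decompression step, here is a minimal Python sketch of reading the rows straight out of a gzipped tar archive with the standard library. It assumes the archive contains one or more CSV files with a header row and that the text is UTF-8 encoded; neither assumption is confirmed by the bulk downloads page. The function name `iter_rows` is purely illustrative.

```python
import csv
import io
import tarfile


def iter_rows(archive_path):
    """Yield each CSV row as a dict from every regular file in a .tgz archive.

    Assumes (not confirmed by the site) that each member is a UTF-8 CSV
    whose first line holds the column names.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            raw = archive.extractfile(member)
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)
```

Because the tar layer is unpacked properly, the file name, the runs of hex 00s and the trailing padding described in the comment below never reach the CSV parser.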


  • Comments from my first visit:

    Maybe I missed it, but there’s no place on the http://transparencydata.com/ site to make comments or give feedback. What about a wiki so info could be added by others about what is learned?

    Wouldn’t the use of the more common .zip format instead of .gz make the data available to a wider audience? Perhaps .gz is a bit more optimized, but IMHO you’re restricting the user audience.

    Linux and Macs don’t care about file extensions, but file extensions might be useful for Windows users. (I switch between Linux and Windows all the time for various analysis tools.)

    I like to work with bulk data, so it’s great that you’re offering that option. I took a look at the contributions.fec.yyyy.gz and contributions.nimsp.yyyy.gz files with yyyy=2006, 2008 and 2010.

    I don’t think your info about these bulk files is adequate (or did I miss this explanation somewhere?). After expanding the files using gunzip in Linux, the resulting files seem to have a header record, a trailer record and comma-separated data in between.

    The bulk header has the name of the file, followed by a number of hex 00s, followed by some unknown numeric data, followed by more hex 00s, and so on, and ends with the comma-separated column names. I’d like to use R to analyze this data (among other tools), and this unusual header will make processing a bit more of a pain.

    The last line of the bulk file appears to be a large number (perhaps several hundred) of hex 00s.

    These header and trailer records are making the data harder to access for those who might be comfortable with many analysis tools.

    I manually modified the header and deleted the trailer record using a file editor. The resulting data was easy to load into a Microsoft Access database. (I used all text fields in Access, except for the amount field.)

    The Contributions info on the Documentation tab seems to be a good description of the data.

    I ran a query in Access for all the districts with a “ks” prefix and found the contributions.nimsp.2010.gz file only has data for the Kansas Senate (seat = “state:upper”), but there are some likely incorrect seat = “state:office” values that should have been “state:upper”. I think district = “KS-10” with seat = “state:office” is incorrect for the 10th Senate district.

    Similar data is now online from the Kansas Governmental Ethics Commission for 2009. I’ll eventually compare the 684 “ks” district records from your file with what our Ethics Commission says.

    You need transparency about this transparency data. For example, somewhere you should say that the contributions.nimsp.2010.gz file only has Kansas Senate data and no Kansas House data. A wiki with a page per state?

    Our Senate is only elected every four years, so the Kansas Senate data is not very relevant right now since they are not elected until 2012. There does not appear to be any seat = “state:lower” data for the Kansas House, which would be much more useful since they are elected every two years.

    FWIW, in the online selection of filters, I don’t see a way to find data for a particular seat (say seat = “state:upper”), or a particular district. Could you consider a filter for all data items?

    Thanks for all the great data!
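The Access query described in this comment (all districts with a “ks” prefix, grouped by seat) could also be sketched in a few lines of Python. This is only an illustration: it assumes the rows are dictionaries keyed by column names `district` and `seat`, as the comment suggests, and the exact header spelling in the bulk files is not confirmed here. The helper name `seat_counts` is hypothetical.

```python
from collections import Counter


def seat_counts(rows, district_prefix):
    """Count records per seat value for districts starting with a prefix.

    `rows` is any iterable of dicts; the "district" and "seat" keys are
    assumed column names taken from the comment above.
    """
    counts = Counter()
    prefix = district_prefix.lower()
    for row in rows:
        district = (row.get("district") or "").lower()
        if district.startswith(prefix):
            counts[row.get("seat") or ""] += 1
    return counts
```

A tally like this makes suspicious values easy to spot: any seat other than “state:upper” or “state:lower” showing up for a state legislative district is worth a second look.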

  • May I reprint your article (no changes)? I am fascinated and extremely excited by this latest venture of yours. TYTYTYTYTYTY! Now I want my contacts to know about it too.

  • Alise

    Where does the data you use come from?

  • I did a little browsing and came up with a record of a donation last year by the Democratic Congressional Campaign Committee to Michele Bachmann.

    I can’t imagine this to be true. Makes me wonder….

    Federal 2009-03-31 $6,577.00 Democratic Congressional Campaign Cmte Michele Bachmann (R)

    • So there is an explanation here.

      The DCCC makes independent expenditures both FOR and AGAINST candidates for office. In this case, this spending is AGAINST Michele Bachmann. There is a search filter on TransparencyData.com that is titled “For/Against Candidate.” If you select “For Candidate,” only contributions made to or on behalf of that candidate will appear–in the case of the search you did, no contributions would show up from the DCCC to Bachmann.

      It’s a little confusing right now, but we’re working on a way to make it clearer soon.

      (If you thought that search that you did was surprising, try searching with “Contributor”: Republican National Committee and “Recipient”: Obama, Barack with no “For/Against Candidate” filter. That will throw you for a loop.)