Do-it-yourself Data


Parke Wilde, writing at the U.S. Food Policy blog, has a pretty good idea: take data from different sources, line it up and organize it by congressional district, and then present it–either graphically (as a map) or in a table, for easy analysis–to find out what individual members are up to. I’ll return to this in a minute, because it’s an intriguing notion that fits in with something I’ve been kicking around in my head for a while, but first let’s look at what Wilde did. He pulled together campaign contributions from C-Span, farm subsidy payments from the Environmental Working Group and earmarked pork projects from Citizens Against Government Waste, all for a single district: that of Rep. Tom Latham, R-Iowa, a member of the House Agriculture Appropriations Subcommittee.

He found lots of PAC donations from agricultural interests, lots of earmarks for agricultural programs and lots of farm subsidies flowing into Latham’s district. Wilde then writes, “I would welcome it if somebody can figure out a more systematic way to automate the cross-linkages between these information sources, to find the most egregious examples of the subsidy – earmark – donations political nexus.”

And that got me thinking. Right now we have a fun feature called Congress in 30 Seconds that lets users make their own videos–drop in music, text and images to make a thirty-second spot. (I’m going to get around to doing one myself this weekend.) So, is there a way you could do the same thing with data? Choose, let’s say, House Agriculture Appropriations Subcommittee members in one column, contributions from agricultural interests (or even, if you like, meat producers as a subset of agriculture) to those members in another, bills sponsored by them, and so on, and end up with your own table to post right there on your blog. Or you could work backwards from, let’s say, recipients of earmarks to lobbyists hired by those recipients to lobbyists who are former staffers of members to the members themselves.
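To make the "build your own table" idea a little more concrete, here is a rough sketch in Python of what the first example might look like. Everything in it is an assumption for illustration: the record layouts, the `member_id` field, the `build_table` function, and all of the names and dollar amounts are invented, and no real data source is being queried.

```python
from collections import defaultdict

COMMITTEE = "House Agriculture Appropriations Subcommittee"

# Hypothetical sample records: every name and amount below is invented.
members = [
    {"member_id": "M001", "name": "Member A", "committee": COMMITTEE},
    {"member_id": "M002", "name": "Member B", "committee": COMMITTEE},
]
contributions = [
    {"member_id": "M001", "sector": "agriculture", "subsector": "meat producers", "amount": 5000},
    {"member_id": "M001", "sector": "agriculture", "subsector": "crop production", "amount": 2500},
    {"member_id": "M002", "sector": "agriculture", "subsector": "meat producers", "amount": 1000},
    {"member_id": "M002", "sector": "energy", "subsector": "oil and gas", "amount": 4000},
]
bills = [
    {"member_id": "M001", "bill": "Bill X"},
    {"member_id": "M002", "bill": "Bill Y"},
    {"member_id": "M002", "bill": "Bill Z"},
]


def build_table(members, contributions, bills, committee, sector, subsector=None):
    """Return one row per committee member: name, sector contributions, bills."""
    # Total the contributions from the chosen sector (and optional subset).
    totals = defaultdict(int)
    for c in contributions:
        if c["sector"] != sector:
            continue
        if subsector is not None and c["subsector"] != subsector:
            continue
        totals[c["member_id"]] += c["amount"]

    # Group sponsored bills by member.
    sponsored = defaultdict(list)
    for b in bills:
        sponsored[b["member_id"]].append(b["bill"])

    # Line everything up on the shared member_id key.
    rows = []
    for m in members:
        if m["committee"] != committee:
            continue
        rows.append({
            "name": m["name"],
            "contributions": totals.get(m["member_id"], 0),
            "bills": sponsored.get(m["member_id"], []),
        })
    return rows


all_agriculture = build_table(members, contributions, bills, COMMITTEE, "agriculture")
meat_only = build_table(members, contributions, bills, COMMITTEE, "agriculture", "meat producers")

for row in all_agriculture:
    print(row["name"], row["contributions"], ", ".join(row["bills"]))
```

The whole thing only works because every record carries the same `member_id`. That is a convenience of the made-up data, not something the real sources are known to share.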

What stands in our way is probably the way that this data is coded–these disparate sets of data don’t talk to one another–but that’s something that ideally we’d like to overcome.
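Here is one guess at what overcoming that might involve, again as a Python sketch rather than a working tool. It assumes, purely for illustration, that three sources label the same congressional district in three different ways, and it converts each label to one common key before lining the records up. The label formats, the `normalize_district` and `merge_by_district` functions, the partial state lookup and the figures are all hypothetical.

```python
import re

# Partial lookup for illustration only; a real one would cover every state.
STATE_ABBREVIATIONS = {"iowa": "IA", "ohio": "OH"}


def normalize_district(raw):
    """Turn labels such as 'IA-4', 'Iowa 4th' or 'IA04' into the form 'IA-04'."""
    text = raw.strip().lower()
    match = re.match(r"^([a-z ]+?)[\s\-]*(\d+)(?:st|nd|rd|th)?$", text)
    if not match:
        raise ValueError(f"unrecognized district label: {raw!r}")
    state = match.group(1).strip()
    number = int(match.group(2))
    if len(state) == 2:
        abbreviation = state.upper()  # assume it is already a postal code
    elif state in STATE_ABBREVIATIONS:
        abbreviation = STATE_ABBREVIATIONS[state]
    else:
        raise ValueError(f"unknown state in label: {raw!r}")
    return f"{abbreviation}-{number:02d}"


def merge_by_district(**sources):
    """Combine {district label: value} mappings under one normalized key."""
    merged = {}
    for source_name, records in sources.items():
        for raw_district, value in records.items():
            key = normalize_district(raw_district)
            merged.setdefault(key, {})[source_name] = value
    return merged


# Invented figures, one per hypothetical source, each with its own labeling.
merged = merge_by_district(
    contributions={"IA-4": 100},
    subsidies={"Iowa 4th": 200},
    earmarks={"IA04": 300},
)
print(merged)
```

District labels are probably the easy case. Matching up donors, lobbyists and earmark recipients by name would be a good deal messier, and this sketch doesn't attempt it.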

Something to keep thinking about…