On Monday the House of Representatives delivered, as promised, an electronic dump of House Expense Reports. We at Sunlight Labs had a plan. We knew it was going to be a huge PDF, but we had all the infrastructure in place. We had plenty of bandwidth, and we knew when the data was coming out, roughly how it was going to look, and that we likely wouldn't be able to parse it all with computers. "We'll use TransparencyCorps," we thought, "to get that last mile out of the data, so that eventually we'll end up with a parseable database."