It's been a while since recovery.gov was in the headlines. ARRA money continues to go out the door, but it's safe to say the program is winding down. The Administration has been taking a quiet victory lap, including this charming video, in which Vice President Biden calls up an ice creamery in Santa Cruz that got off the ground thanks to a Recovery Act loan:
Here's a crazy idea: why don't we look up this loan on recovery.gov and see what else we can discover about it?
Today we're launching Clearspending -- a site devoted to our analysis of the data behind USASpending.gov. Ellen's already written about this project over on the main foundation blog, and you should certainly check out her post. But I wanted to talk about it a little bit here, too, because this project is near & dear to my heart, having grown out of work that Kaitlin, Kevin and I did together before I stepped into the role of Labs Director.
The three of us had been working with the USASpending database for a while, and in the course of that work we began to realize some discouraging things. The data clearly had some problems. We did some research and wrote some tests to quantify those problems -- that effort turned into Clearspending. The results were unequivocal: the data was bad -- really bad. Unusably bad, in fact. As things currently stand, USASpending.gov really can't be relied upon.
You can read all about it over at the Clearspending site, and I hope you will -- in addition to an analysis that looked at millions of rows of data and found over a trillion dollars' worth of messed-up spending reports, we spent a lot of time talking to officials at all levels of the reporting chain. I don't think you're likely to find a better discussion of these systems and their problems.
And make no mistake, these systems are important.
USASpending.gov got a face-lift on Wednesday evening, and it brought with it a raft of new features. Some of these are great; others are either not very useful, or an actual step backward. Let's run through them -- not only to highlight the features and shortcomings, but to examine what they can tell us about how government should be opening its data.
You've already heard me complain about data quality -- how it's a bigger problem than most people realize, and a harder problem than many people hope. But let's not leave it there! Perfect datasets mostly exist in textbooks and computer simulations. We need to figure out what we can do with what we have. In this and other posts, I hope to give the developers in our community some idea of how they can deal with less-than-perfect data.
The first step is to figure out how bad things actually are. To do that, we'll use some simple statistics -- those of you with a strong stats background can skip to the next entry in your RSS reader (or better yet, correct my mistakes in the comments).
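To make "how bad things actually are" a little more concrete, here is a minimal sketch of the kind of measurement I have in mind. Everything in it is an assumption made for illustration: the records, the `recipient` and `amount` field names, and the choice of checks are hypothetical and do not come from any real dataset. It simply counts how often a field is blank or unparseable, then averages the values that survive.

```python
import statistics

# Hypothetical records; the field names and values are made up for illustration.
records = [
    {"recipient": "Acme Farms", "amount": "1200.50"},
    {"recipient": "", "amount": "980"},
    {"recipient": "Delta Co-op", "amount": "n/a"},
    {"recipient": "Elm Street Dairy", "amount": "1500"},
]


def parse_amount(raw):
    """Return the amount as a float, or None if it cannot be parsed."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


def quality_report(rows):
    """Summarize blank recipients and unparseable amounts in a list of rows."""
    if not rows:
        raise ValueError("need at least one row to compute rates")
    amounts = [parse_amount(row.get("amount")) for row in rows]
    valid = [a for a in amounts if a is not None]
    blank = sum(1 for row in rows if not (row.get("recipient") or "").strip())
    return {
        "rows": len(rows),
        "blank_recipient_rate": blank / len(rows),
        "bad_amount_rate": (len(rows) - len(valid)) / len(rows),
        # The mean covers only the amounts that parsed, so it is itself suspect
        # whenever bad_amount_rate is high.
        "mean_amount": statistics.mean(valid) if valid else None,
    }


report = quality_report(records)
print(report)
```

Rates like these are crude, but they turn "the data seems off" into a number you can track from one release of a dataset to the next.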
Recovery.gov relaunched yesterday, and we've spent some time playing around with the site since then. The verdict? Well, it's hard to say -- the site's a bit broken. There are 404s all over the place, most gallingly on the data download page. Parts of the site seem like they work, but don't: the select boxes on the front page that provide filters for the map don't actually affect its behavior in any way. It's hard to see these glaring bugs alongside the totally-unnecessary link to Facebook and not groan (am I supposed to play Scrabble with Chairman Devaney?).
For almost a decade, some divisions of the Department of Agriculture published, in a publicly available online database of government grants, the Social Security numbers of individuals who received federal aid. The Farm Service Agency and at least one other agency within Agriculture included the nine-digit numbers as part of the tracking number assigned to each recipient of government assistance, called a Federal Award ID.
Those tracking numbers were then published in the Federal Assistance Awards Database System (FAADS), an online compendium of “all types of financial assistance awards made by federal agencies to all types of recipients,” which is updated quarterly. This database is generally used by experts and is not very user-friendly.