Case study in trying to analyze earmark data


Each year, Congress allocates billions in earmarks that come in the form of annual appropriation committee requests or are attached to various bills that become law. The Sunlight Foundation thought it would be interesting to examine which earmarks, after all the Congressional debate and bluster have died down, actually get spent.

We thought a good example would be the $23 billion in transportation earmarks from SAFETEA-LU, the Safe, Accountable, Flexible, Efficient Transportation Equity Act: A Legacy for Users. The act authorized spending on highways, transit systems, port facilities, bus routes and other projects from its passage in August 2005 to what was supposed to be its expiration in September 2009. Earlier this year, Congress renewed SAFETEA-LU temporarily, and then extended it to the end of 2010 in the Hiring Incentives to Restore Employment Act on March 19. Nevertheless, most of the projects funded by the 6,306 earmarks in the original act should be completed or nearing completion.

Additionally, since the earmarks fell to the Department of Transportation, specifically the Federal Highway Administration and the Federal Transit Administration, it seemed that requesting such agency-specific information would be easier than doing so for other acts that include earmarks for multiple departments and agencies. One appropriations bill, for example, funds the Labor, Education and Health and Human Services departments, plus a large number of related agencies.

So we approached Transportation and started asking about earmarks. After a bit of negotiation, the Federal Highway Administration disclosed data on unspent FHWA earmarks, which we will publish in a future post. The Federal Transit Administration also kept such records, but we had to file a Freedom of Information Act request to obtain a copy of its earmark and earmark grant report from the last five years. Specifically, we asked for data from the government’s internal Transportation Electronic Award and Management (TEAM) system that showed “earmark identification numbers that are tied to an obligated project identification number.”

In government lingo, an “obligated project” means that there is a binding agreement in place committing the federal agency to pay for an approved project in a community. There is a separate designation for when the money is actually spent, known as disbursements or outlays, but for all intents and purposes, an obligated amount is a good indication that a project will be funded.

A few weeks later, we were delighted to receive a positive response from the FTA. But we were not so delighted that it came in the form of a 121-page paper printout of what was obviously a database that could have been transmitted electronically. Our FOIA request, of course, had asked for the information in an electronic format. So we asked again for an electronic copy, and were told that there was no way we would get it.

Realizing we would have to scan each page and use optical character recognition (OCR) software to turn paper into data, we put the project on hold for some months. When earmark season rolled around this year, we asked yet again for an electronic copy, and were told that it couldn’t happen. So we set to scanning each page and cleaning up the data. Anyone who has tried to OCR a document will understand what a frustrating experience this can be. Not only do you have to match up columns and rows, you have to make sure every little detail was scanned accurately – so that a zero didn’t appear as the letter “O” or a slight smudge didn’t suddenly become a decimal point.
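For readers facing a similar pile of scanned pages, here is a minimal Python sketch of the kind of automated check that can take some of the pain out of proofreading dollar amounts. It is not the process we used. The function name, the character mapping and the "whole dollars or exactly two decimal places" rule are all assumptions made for illustration, and a check like this will not catch every smudge, so the output still needs to be compared against the paper.

```python
import re

# Letters that OCR software commonly confuses with digits.
# This mapping is illustrative, not exhaustive.
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1"})


def clean_amount(raw):
    """Return (amount_in_cents, needs_review) for one OCR'd dollar cell.

    amount_in_cents is None when the cell cannot be trusted.
    """
    text = raw.strip().translate(OCR_DIGIT_FIXES)
    text = text.replace("$", "").replace(",", "").replace(" ", "")
    # Accept only whole dollars or dollars with exactly two decimal places.
    # Anything else (such as a smudge read as a decimal point in the
    # middle of a number) is flagged for a human to check.
    if re.fullmatch(r"\d+(\.\d{2})?", text):
        dollars, _, cents = text.partition(".")
        return int(dollars) * 100 + int(cents or 0), False
    return None, True


# Made-up cells: one with a letter "O" for a zero, one with a stray period.
print(clean_amount("$1,2O0,000"))
print(clean_amount("1.250,000"))
```

Keeping amounts as whole cents avoids floating-point rounding, and returning a review flag instead of guessing means questionable cells go back to a person rather than silently into the spreadsheet.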

And there was another issue: the data we had was six months old, and it was always possible that earmarks without obligations had been funded in the meantime. When we asked the FTA’s media relations staff if we could get an updated copy of the database, we were told that we had to submit another FOIA request.

Several hair-tearing weeks later, armed with an OCR’d spreadsheet (albeit a six-month-old one), we began working with the data to see what could be analyzed. We quickly realized that we needed to talk to someone at FTA who works directly with this data to fully understand what we had. When we finally got hold of someone, we learned that the original response to our FOIA request had been sent to the FTA FOIA office as an Excel spreadsheet.

The FOIA office, following protocol, made a hard copy of that spreadsheet and sent it on to us. The reason for using paper, explained FTA spokesman Paul Griffo, is that if any information in a spreadsheet is deemed unfit for public view, a hard copy allows the FOIA and press offices to black out or redact it.

Nothing in my data was deemed redact-worthy, yet I still could not get an electronic copy through the FOIA office. In the end, the program office graciously gave me an up-to-date electronic copy of the earmarks data, and I marveled at how one click of a send button in a government office could bring me such joy.

But upon analyzing what was, and was not, obligated, I ran into several stumbling blocks. I was given two spreadsheets: one with 7,300 rows listing just earmarks, and another with about 6,200 rows listing just grants. I was told that any earmark that showed up in the grants table with a listed obligation date had been either partially or fully spent. Those without obligation dates were never spent. It’s still unclear what happened with the 1,000-plus earmarks that don’t appear in the grants database at all.
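The sorting rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as it was explained to me, not the FTA’s own method. The column names `earmark_id` and `obligation_date` are hypothetical, as are the sample rows, and the sketch assumes each spreadsheet has been loaded as a list of dictionaries, one per row.

```python
def classify_earmarks(earmarks, grants):
    """Sort earmark IDs into three buckets based on the grants table."""
    # An earmark counts as obligated if any of its grant rows has a date.
    has_date = {}
    for row in grants:
        eid = row["earmark_id"]
        dated = bool((row.get("obligation_date") or "").strip())
        has_date[eid] = has_date.get(eid, False) or dated

    buckets = {"obligated": [], "no_obligation_date": [], "not_in_grants": []}
    for row in earmarks:
        eid = row["earmark_id"]
        if eid not in has_date:
            buckets["not_in_grants"].append(eid)
        elif has_date[eid]:
            buckets["obligated"].append(eid)
        else:
            buckets["no_obligation_date"].append(eid)
    return buckets


# Made-up rows for illustration only; real IDs and dates would come
# from the two spreadsheets.
earmarks = [{"earmark_id": "E1"}, {"earmark_id": "E2"}, {"earmark_id": "E3"}]
grants = [
    {"earmark_id": "E1", "obligation_date": "2008-05-01"},
    {"earmark_id": "E2", "obligation_date": ""},
]
print(classify_earmarks(earmarks, grants))
```

Keeping a separate bucket for earmarks that never appear in the grants table matters here, because a missing row and a missing date are two different unknowns and should not be lumped together as "unspent."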

After calling several recipients whose grants had no listed obligation date, I learned that many of the grants had in fact been obligated. The data was telling me one thing, but recipients at large metropolitan transit agencies across the country were telling me something entirely different.

I did find some very interesting earmarks that never got spent, which will be detailed in accompanying posts. But without fully clean and accurate data, it would be irresponsible to write about this database in aggregate. The program office admitted that the data wasn’t perfect. Sometimes different people enter different things. Sometimes you have to work with multiple internal databases that only FTA employees have access to.

And after weeks of communication with the program office, I was told the magic words: the FTA did keep a separate list of just those earmarks under SAFETEA-LU that never got spent. However, I would have to file another FOIA request.

So this part of the story ends where it began: the Freedom of Information Act. In the coming weeks we hope to get that FOIA response, and we’ll be sharing every agonizing moment with you – so that you too can learn the joys of obtaining and analyzing government data.