OpenGov Voices: Hack the Budget! (or try to)


Disclaimer: The opinions expressed by the guest blogger and those providing comments are theirs alone and do not reflect the opinions of the Sunlight Foundation or any employee thereof. Sunlight Foundation is not responsible for the accuracy of any of the information within the guest blog.

This is a guest post from Anthony Holley, a member of the intrepid Hack for Western Mass team that spent the weekend of June 1 trying to track money reported in USASpending back to the federal budget. Anthony is a writer living and working in Amherst, MA. He is interested in helping good non-profits grow so that they can do their best work.

Advocacy groups like the Sunlight Foundation and National Priorities Project have long lamented the state of the data gathered on USASpending, which remains the key searchable data repository for those interested in learning about and educating others on our federal spending. Individual federal agencies are responsible for reporting their expenditures to USASpending in the interest of contributing to an open, transparent government. A working group at the Western Massachusetts Civic Day of Hacking sought to reconcile the information on USASpending with the information in the budget appendix, published by the Office of Management and Budget (OMB), keeping in mind that true transparency means being able to track expenditures from the budget to USASpending. A group of computer programmers, data managers, and political activists got together to work on this problem over the weekend of June 1 and 2. What we found was that this goal was at least very difficult, and perhaps impossible, to achieve.

We started by looking at the data on the OMB website to see if we could parse it for useful identifiers that we could then match up with the data on USASpending. The data on the OMB website turns out to be particularly user-unfriendly for these purposes, presented in XML and PDF formats that are not easy to search by category. Each section of the budget has a Treasury ID, so we took that as our starting point for trying to match expenditures listed in the budget with USASpending.

Hack for Western Mass team. Photo credit: Molly McLeod

For our first pass, we chose the National Science Foundation. We used the Treasury code to match the grants data in both USASpending and the budget appendix. This worked well: we were able to find a close match, at least by the standards of the federal budget. The two totals were off by only a couple hundred million dollars. This was a relatively simple example, since most of the NSF expenditures are in grants (something that a few members of our team had personal experience with) and there is a category of CSV file in the budget appendix that covers only grants.
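The matching step described above can be sketched roughly as follows. This is a minimal illustration, not the code the team wrote: the field names (`treasury_code`, `budget_amount`, `obligated_amount`) and the sample rows are hypothetical, and real exports from either source would need their own column names and cleanup.

```python
from collections import defaultdict


def totals_by_code(rows, code_field, amount_field):
    """Sum dollar amounts per Treasury code, skipping rows with no code."""
    totals = defaultdict(float)
    for row in rows:
        code = (row.get(code_field) or "").strip()
        if code:
            totals[code] += float(row[amount_field])
    return dict(totals)


def reconcile(budget_rows, spending_rows):
    """Compare per-code totals from the two sources.

    Field names here are assumptions for illustration only.
    A gap of None means the code appears in just one source.
    """
    budget = totals_by_code(budget_rows, "treasury_code", "budget_amount")
    spending = totals_by_code(spending_rows, "treasury_code", "obligated_amount")
    report = {}
    for code in sorted(set(budget) | set(spending)):
        b = budget.get(code)
        s = spending.get(code)
        gap = None if b is None or s is None else b - s
        report[code] = {"budget": b, "spending": s, "gap": gap}
    return report


# Made-up rows, only to show the shape of the inputs.
sample_budget = [
    {"treasury_code": "000-0001", "budget_amount": "100"},
    {"treasury_code": "000-0002", "budget_amount": "50"},
]
sample_spending = [
    {"treasury_code": "000-0001", "obligated_amount": "60"},
    {"treasury_code": "000-0001", "obligated_amount": "30"},
    {"treasury_code": "", "obligated_amount": "999"},  # no Treasury code reported
]
sample_report = reconcile(sample_budget, sample_spending)
```

Rows that carry no Treasury code simply drop out of the comparison, which is why missing codes make a reconciliation like this unreliable.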

The same procedure did not work for the Social Security Administration. This was because, as far as USASpending was concerned, the Catalog of Federal Domestic Assistance (CFDA) codes are the only required identifiers. Some agencies report both CFDA codes and Treasury codes, but some don’t. Also, while the CFDA codes are related to the Treasury codes, it is very difficult to understand their exact relationship, and thus a meaningful map cannot be made with them. In the time we had, it was not possible to relate the CFDA codes in USASpending to the Treasury codes listed in the federal budget. Because the CFDA codes are the ones that the agencies use for internal purposes, they seem to be most useful in determining the origin and purpose of grant spending. However, because they may not be reliably linked to Treasury codes, it is difficult to relate this information back to the federal budget.

We tried doing audits of other agencies with similarly frustrating results, but we are also in the process of expanding our understanding of how these relationships work. It may be that we chose the wrong path for trying to establish a link between the budget and the reports on USASpending, but the link between the CFDA and Treasury codes held the most promise at the beginning.

We are certainly open to suggestions on ways to establish links between these data sets. Our key recommendation is for the agencies to provide the relevant Treasury codes attached to the data and to do so in a uniform manner. Another issue we had is that sometimes the Treasury codes are in the same cells as prose descriptions of the programs listed. This makes the data virtually impossible to parse, and a clear picture therefore harder to obtain. Uniformity in these listings would at least form the beginning of understanding how USASpending relates back to the federal budget.
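To give a sense of the cleanup that mixed cells force on anyone using the data, here is a small sketch that pulls code-like tokens out of a prose cell. The pattern is an assumption: it treats a Treasury code as two or three digits, a hyphen, and four digits, which may not match every format the agencies actually use, and the sample text is invented.

```python
import re

# Assumed shape of a Treasury code: 2-3 digits, a hyphen, 4 digits.
# Real listings may use other layouts; adjust the pattern accordingly.
CODE_PATTERN = re.compile(r"\b(\d{2,3})-(\d{4})\b")


def extract_codes(cell_text):
    """Return code-like tokens found in a cell, zero-padded to 3+4 digits."""
    return [f"{int(agency):03d}-{account}"
            for agency, account in CODE_PATTERN.findall(cell_text)]


# Invented example of a cell that mixes a description with a code.
codes = extract_codes("Example Program (49-0100) supports basic research.")
```

A heuristic like this can misfire on any other hyphenated number in the description, which is part of why a dedicated, uniformly formatted code column would be far more dependable.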

Interested in writing a guest blog for Sunlight? Email us at