House Oversight Subcommittee Discusses Problems with Data


On Friday, Ellen testified in front of the Subcommittee on Technology, Information Policy, Intergovernmental Relations and Procurement Reform, a subcommittee of the House Committee on Oversight and Reform. Her testimony mostly focused on the findings from our Clearspending project, which assessed the data quality of the grant programs in USAspending.gov. It was heartening to see the committee taking the issue of data quality in USAspending.gov so seriously. While admittedly not a sexy topic, this issue has serious implications for the decisions that the government makes about our federal spending. To quote Rep. Issa’s opening statement, “The failures to make the data right is the reason we’re not getting a responsible government”.

Clearspending found nearly $1.3 trillion in misreported spending in 2009. This includes spending reports that were late, incomplete, or inconsistent with other information sources that track federal spending. In Ellen’s testimony, she discussed two specific examples of poor data quality in USAspending.gov: the Department of Education reported over $6 trillion in student loans for 2010, and the Department of Agriculture did not report any spending for the National School Lunch Program, which obligated $8 billion in grants last year. The CIOs from both of these agencies also testified on the panel, and were given a chance to respond to our critiques during the committee Q&A.

Chris Smith, the CIO of the USDA, testified that the reason the grants were not reported was that they went to individuals, and the law governing grant reporting does not require reporting for grants to individuals. However, the actual program description describes these grants as formula grants to states. The entity receiving the grant is a state, not an individual, and therefore the grant is subject to the reporting requirements. Smith also mentioned that the transactions were under $25,000 and therefore not subject to the reporting requirement. While this may be the case, it seems unlikely. The program in question has a $10 billion budget. Let’s say that each state gets an equal payment once a month. That would still be over $16 million per transaction–not even close to the $25,000 minimum. It seems that the reporting guidelines have been misinterpreted in this case.
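The back-of-envelope arithmetic above can be checked directly. This is a sketch under the stated simplifying assumption of equal monthly payments to each state; the real disbursement schedule certainly differs, but any plausible schedule leaves individual transactions far above $25,000.

```python
# Hypothetical equal-monthly-payment model of the ~$10B program budget.
PROGRAM_BUDGET = 10_000_000_000  # annual budget, dollars
STATES = 50
PAYMENTS_PER_YEAR = 12
REPORTING_THRESHOLD = 25_000     # transactions under this need not be reported

per_transaction = PROGRAM_BUDGET / STATES / PAYMENTS_PER_YEAR
print(f"${per_transaction:,.0f} per transaction")  # ≈ $16,666,667
print(per_transaction > REPORTING_THRESHOLD)       # True: well above $25,000
```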

Dr. Harris from the Department of Education testified that he thinks the student loan error may be related to incorrect aggregation of the individual loan records in USAspending.gov. He also said that when “[he] looks at the data in USAspending.gov, it is accurate for the Department of Education”. I’m afraid we have to disagree. Let’s look at some screenshots:

[Image: USAspending.gov timeline of student loan transactions]

The real problem with this picture is the way USAspending.gov displays loans. Loan records are supposed to have two vitally important fields: the overall value of the loan, and the subsidy cost of the loan. The subsidy cost takes into account the default rate and other possible sources of subsidies. Because USAspending.gov totals the value of loan records on the subsidy cost field, the sum of all student loan obligations appears to be zero. The fact that none of the student loans have a subsidy cost is a major data quality issue unto itself. However, if you download a snapshot of the student loans from 2010, their face value adds up to over $6 trillion. You can see this discrepancy for yourself if you go to the right part of USAspending.gov; for example, by searching for Education loans with face values greater than $1,000,000. Do so and you’ll see that there were 421,311 such transactions in 2010, for a total value of… $0 (but a non-zero bar graph!).
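The aggregation problem described above is easy to reproduce in miniature. This is a minimal sketch with invented records (a flat $15 million face value per loan is an assumption for illustration, not real data): totaling on a subsidy-cost column that is uniformly zero reports $0, while the face-value column carries the actual obligations.

```python
# Invented loan records mimicking the 2010 snapshot: 421,311 transactions,
# every one with a zero subsidy cost. Face values are made up.
loans = [
    {"record_id": i, "face_value": 15_000_000, "subsidy_cost": 0}
    for i in range(421_311)
]

# Summing on the subsidy-cost field -- what the site effectively displays.
print(sum(loan["subsidy_cost"] for loan in loans))  # 0

# Summing on the face-value field -- the multi-trillion-dollar reality.
print(sum(loan["face_value"] for loan in loans))    # over $6 trillion
```

The same 421,311 records yield $0 or trillions of dollars depending solely on which field the aggregation runs over.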

Dr. Harris chalked these discrepancies up to “aggregation errors” or to the fact that every time his department submits a loan transaction to USAspending.gov, they are required to enter the face value of the loan, resulting in duplicate entries. But this only underscores the confusion surrounding OMB’s reporting guidelines. Each transaction in USAspending.gov has a unique record ID and is intended to be updated, not duplicated.

It’s not our goal to assign blame for these errors to the engineers at USAspending.gov, OMB, or agency staff–we just want the data to be fixed. The fact that glaring reporting errors like these can persist suggests that these problems won’t be fixed unless attention is drawn to them. Even then it will be important to ask whether the fundamental reporting process has been fixed, or if we’re just correcting the most easily-found errors.

The House Oversight Committee did a great job calling attention to these issues. I hope they continue working to make USAspending.gov a meaningful oversight tool.