According to census figures, the U.S. has over 91,000 state and local governments. The vast majority of these institutions produce documents that affect the citizens they serve. While many individual governments offer document repositories and diligently respond to public records requests, disclosure is inconsistent and not standardized. This poses challenges for researchers who wish to compare multiple governments.
California Common Sense (CaCS), a think tank focused on data-driven policy analysis, encountered this problem and has devised an interesting solution. CaCS is collecting a large number of local government documents, storing them in the cloud and making them available to other researchers. The group is also encouraging governments, researchers and citizens to contribute additional documents for hosting on the CaCS cloud.
The CaCS Open Records Initiative (ORI) launched late last year and hosts government documents produced in all 50 states. CaCS's initial focus is on government financial documents, especially budgets and audited financial statements (often published as Comprehensive Annual Financial Reports, or CAFRs). Other areas of interest include actuarial reports from public employee pension systems, school performance data and information on prison populations. The initiative is currently hosting 71,000 documents, and CaCS expects this number to reach 150,000 by the end of 2015 as team members continue collecting documents from governments nationwide. PDF documents lacking embedded text are being run through an OCR process, so all posted documents should be keyword searchable.
To obtain the documents, the CaCS team is reviewing local government web sites and state-level resources, downloading all relevant files and transferring them to the online ORI repository. If team members are unable to locate documents for a given entity online, they make an effort to contact someone from the government.
Rather than just focus on larger cities and counties, CaCS is including as many jurisdictions as possible with a population of about 700 or higher. This cutoff was the result of statistical analyses CaCS staff performed on a sample of jurisdictions. They found that governments overseeing populations of 709 or greater had an 86 percent chance of having two or more methods of contact, whereas those below the threshold had only a 29 percent chance of having more than one contact method. As CaCS Executive Director Autumn Carter notes: “Being able to contact a city or town is the first natural step to filing a public records request.” Carter also told me that CaCS will include documents for smaller towns on a case-by-case basis.
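For readers curious how a cutoff like this might be checked, here is a minimal Python sketch that compares the share of jurisdictions with two or more contact methods above and below a population threshold. It is not CaCS's actual analysis, which the group has not published in detail here. The `Jurisdiction` type, the `share_with_multiple_contacts` function and every row in `SAMPLE` are hypothetical, made up purely for illustration.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Jurisdiction(NamedTuple):
    """One sampled government (hypothetical structure, not CaCS's schema)."""
    name: str
    population: int
    contact_methods: int  # e.g. phone, email, web form, postal address


def share_with_multiple_contacts(
    jurisdictions: Sequence[Jurisdiction], threshold: int
) -> Tuple[Optional[float], Optional[float]]:
    """Return (share at or above threshold, share below threshold).

    Each share is the fraction of jurisdictions in that group with two or
    more contact methods, or None if the group is empty.
    """
    def share(group):
        if not group:
            return None
        reachable = sum(1 for j in group if j.contact_methods >= 2)
        return reachable / len(group)

    at_or_above = [j for j in jurisdictions if j.population >= threshold]
    below = [j for j in jurisdictions if j.population < threshold]
    return share(at_or_above), share(below)


# Invented sample rows for illustration only; these are NOT real figures.
SAMPLE = [
    Jurisdiction("Town A", 250, 1),
    Jurisdiction("Town B", 480, 1),
    Jurisdiction("Town C", 650, 2),
    Jurisdiction("Town D", 709, 2),
    Jurisdiction("Town E", 1200, 3),
    Jurisdiction("Town F", 5400, 1),
    Jurisdiction("Town G", 30000, 4),
]

if __name__ == "__main__":
    above, below = share_with_multiple_contacts(SAMPLE, threshold=709)
    print(f"at or above threshold: {above:.0%}, below threshold: {below:.0%}")
```

Running the same function over a range of candidate thresholds would show where the gap between the two groups is widest, which is one plausible way to arrive at a cutoff such as 709.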
CaCS is hosting the documents using technology provided by Box.com, a popular file sharing service. Box provides substantial discounts to civil society organizations and other nonprofits through its Box.org program. An advantage of the Box web interface is that it allows users to view PDF documents without downloading them. A user can thus decide whether a document meets his or her needs before committing bandwidth to a full download.
Box also provides a document filtering mechanism, but searching the repository remains a bit of a challenge. CaCS plans to provide a document manifest and other tools for quickly locating individual documents in the repository in the months ahead. In the meantime, CaCS already organizes the documents in an intuitive hierarchy, so browsing to a desired file is relatively straightforward, although the process requires multiple clicks.
CaCS is hoping that the research community and local governments will also contribute to the initiative. In the case of governments, Carter suggests a possible advantage. Smaller governments with limited funds and technical expertise do not have to build their own document repositories. By contributing their documents to ORI, they ensure that their public data is online and can even refer public records requesters directly to the CaCS repository.
Those wishing to contribute documents to ORI have three options for doing so. If they have a small number of electronic documents, they can email these to CaCS at firstname.lastname@example.org. Larger sets of files can be zipped and uploaded to http://cacs.org/shared/dropoff.html. Hard copy documents can be mailed to CaCS at 2483 Old Middlefield Way, Suite 210, Mountain View, CA 94043.
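For contributors with a large folder of files, the zipping step can be scripted. The following is a minimal sketch using Python's standard `zipfile` module. The function name `zip_documents` and the folder and archive names in the usage note are hypothetical, and the sketch only builds the archive locally; uploading it to the drop-off page is still done by hand.

```python
import zipfile
from pathlib import Path


def zip_documents(source_dir, archive_path):
    """Bundle every file under source_dir into one zip archive.

    Folder structure is preserved inside the archive so that documents stay
    grouped the way the contributor organized them. Returns the number of
    files written.
    """
    source = Path(source_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it happens
            # to live inside the folder being zipped.
            if path.is_file() and path.resolve() != archive:
                zf.write(path, arcname=path.relative_to(source).as_posix())
                count += 1
    return count
```

A call such as `zip_documents("city_budgets", "city_budgets.zip")` would produce a single file ready for upload, assuming a local folder named `city_budgets` exists.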