Using Technology To Assist Declassification


by Sunlight Foundation policy intern Melanie Buck

The role technology can play in streamlining the declassification process was the topic of a Public Interest Declassification Board meeting on Thursday, Sept. 23. The PIDB is a congressionally established advisory committee that works to facilitate public access to national security-related records. It is considering how to advise agencies on their efforts to declassify approximately 410 million pages of records by December 2013. An agenda for the meeting is available here [PDF].

The Board heard presentations on the feasibility of using automated computer systems to streamline document review. The speakers were Jeff Jonas from IBM, Tom Lee from the Sunlight Foundation, and John Verdi from the Electronic Privacy Information Center.

Jeff Jonas outlined a hypothetical automated system that would tag documents based on key words and phrases to predict whether a document should be declassified. A high level of accuracy would come from training the system on documents that have already been reviewed and from determining how each new document relates to the old information, a process known as “context accumulation.” He likens context accumulation to solving a jigsaw puzzle in this blog post. The technology already exists, but would take some time to implement.
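To make the idea of training on already-reviewed documents concrete, here is a minimal sketch in Python. It is not the system Jonas described and does not attempt context accumulation; it swaps in a simple naive Bayes word-count model as a stand-in. The function names, the "release"/"withhold" labels, and the sample texts are all invented for illustration.

```python
import math
from collections import Counter

PUNCTUATION = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def train(reviewed):
    """Count words per label from (text, label) pairs that humans already reviewed."""
    word_counts = {"release": Counter(), "withhold": Counter()}
    doc_counts = Counter()
    for text, label in reviewed:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["release"]) | set(word_counts["withhold"])
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the label with the higher smoothed log-probability for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        total_words = sum(counts.values())
        # Smoothed prior, so a label with no training documents does not break log().
        score = math.log((doc_counts[label] + 1) / (total_docs + len(word_counts)))
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical, made-up training examples.
reviewed = [
    ("routine travel schedule for embassy staff", "release"),
    ("public press briefing transcript", "release"),
    ("source identity and covert operation details", "withhold"),
    ("covert source payment records", "withhold"),
]
model = train(reviewed)
first_guess = predict(model, "covert source meeting")
second_guess = predict(model, "press briefing schedule")
```

A real system would need far more training data and, as the presentation stressed, some way to relate each new document to what is already known, which this sketch leaves out.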

Tom Lee described the requirements for determining whether a document should be declassified, focusing on how a computer system could help prioritize the work queues of reviewers. For example, the pages that the system judges most likely to be sensitive can be reviewed first, and if a reviewer finds that the system made a correct judgment, the rest of the document (and potentially the document series) can be removed from the work queue. A search algorithm could be trained to return sophisticated results that would give human reviewers a clear indication of the content of a given document. Such a system would involve a fixed up-front cost, with additional computational costs varying with the system’s operational speed.
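The queue prioritization Lee described might look something like the following Python sketch. It rests on assumptions: each page is taken to carry a sensitivity score from some upstream model, and "a correct judgment" is read as the reviewer confirming that a flagged page is indeed sensitive. The function and parameter names are hypothetical.

```python
import heapq

def review_order(pages, reviewer_confirms):
    """Show pages to a reviewer from most to least likely sensitive.

    pages: list of (doc_id, page_no, score) tuples, higher score = more sensitive.
    reviewer_confirms: callable(doc_id, page_no) -> True when the reviewer agrees
    with the system's judgment on that page.
    Returns the (doc_id, page_no) pairs a human actually had to look at.
    """
    # heapq is a min-heap, so negate the score to pop the highest score first.
    heap = [(-score, doc_id, page_no) for doc_id, page_no, score in pages]
    heapq.heapify(heap)
    settled = set()   # documents whose remaining pages left the queue
    reviewed = []
    while heap:
        _, doc_id, page_no = heapq.heappop(heap)
        if doc_id in settled:
            continue  # skip pages of documents that are already settled
        reviewed.append((doc_id, page_no))
        if reviewer_confirms(doc_id, page_no):
            settled.add(doc_id)
    return reviewed

# Hypothetical scores; in this example the reviewer only confirms pages of document "A".
pages = [("A", 1, 0.9), ("A", 2, 0.4), ("B", 1, 0.7), ("B", 2, 0.6)]
shown = review_order(pages, lambda doc_id, page_no: doc_id == "A")
```

In this example the reviewer sees three of the four pages: once page 1 of document A is confirmed, page 2 of that document drops out of the queue. Extending the same shortcut to a whole document series would only require keying the settled set on a series identifier instead.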

John Verdi approached the issue from a policy perspective, explaining his assessment of what transparency groups and the public want from a declassification process. He suggested that preparing unclassified summaries adds work without adding much public value and places an unnecessary burden on the declassification process. In his opinion, resources would be better spent reviewing entire documents and declassifying whenever possible. He also discussed a few transparency tools that many hope to see, such as a large, openly accessible, searchable database of declassified records.

The Board has another meeting scheduled for Nov. 9, 2010, to further investigate ways to facilitate declassification.

Declassification has been a recent focus in Congress as legislators promote a cultural shift from “need to know” to “need to share.” Just last week, Congress sent the “Reducing Over-Classification Act,” H.R. 553, to the President for his signature. Among other things, the legislation requires the Director of National Intelligence to establish policies and procedures to identify the classification of portions of information within an intelligence product, in the hope of facilitating automated review.

We summarized a July 22 discussion of the declassification of historical congressional records here.