DOJ’s FOIA Metadata Strategy Makes Sense


The Department of Justice deserves some applause for its plan to improve public access to FOIA materials. This has been in the works for a while: DOJ’s Open Government Plan (PDF) rightly noted that

the volume of information available can make it difficult for interested persons to find the particular information they seek. Especially when it comes to FOIA disclosures, a uniform system is necessary to allow for easy discovery, identification and retrieval of information.

One could be forgiven for assuming this would lead to yet another monolithic .gov dashboard project that no one would actually use.

That’s not the route they’re taking, though. Instead, the plan emphasizes adopting technology that makes FOIA materials available “through commercial search engines that are already used by millions of people every day.” The announcement goes on to note that:

Today the public is accustomed to using commercial search engines to find information online simply by entering key words. Agencies can ensure that such web searches effectively locate proactive disclosures […] while at the same time, retaining their ability to post their records on their individual agency websites in an organic manner that serves the needs of the frequent visitors to their sites.

This is all exactly right. Government should bring its data to where users already are, not waste time building new destinations for them to discover.

How is DOJ going to get its data into these search engines? Well, they’re planning to use metadata standards, specifically Dublin Core. Like a lot of people who’ve worked as web programmers, I get a bit squeamish around Semantic Web technologies. When this stuff works, it’s basically magic. But its presence is also on my diagnostic list of “signs a project won’t succeed” (I’d put it somewhere between “planning to roll their own web framework” and “lead developer just checked into rehab”).

Still, there’s no question that this technology is designed for just this sort of use. I have my doubts about the utility of the proposed new “FOIA” metadata tag for ordinary users, but it could be tremendously handy for those working on FOIA oversight.

The wisdom of this particular technical plan will likely be decided by the document workflows themselves: if documents are being published by their authors with tools that can’t be made to insist upon metadata, it’s unlikely that this strategy will do anyone much good.
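To make “insisting upon metadata” a little more concrete, here is a minimal sketch of the kind of check a publishing tool could run before a page goes live. It assumes documents are posted as HTML and that Dublin Core fields are embedded using the common `<meta name="DC.title" content="...">` convention. The list of required fields and the names `DublinCoreCollector` and `missing_metadata` are my own illustration, not anything taken from DOJ’s plan.

```python
from html.parser import HTMLParser

# Hypothetical required fields: the announcement does not say which
# Dublin Core elements agencies would actually be asked to supply.
REQUIRED_FIELDS = ("DC.title", "DC.creator", "DC.date")


class DublinCoreCollector(HTMLParser):
    """Collects non-empty <meta name="DC.*" content="..."> tags from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), so fall back to "".
        name = (attr.get("name") or "").strip().lower()
        content = (attr.get("content") or "").strip()
        if name.startswith("dc.") and content:
            self.fields[name] = content


def missing_metadata(html, required=REQUIRED_FIELDS):
    """Return the required Dublin Core fields that are absent or empty."""
    collector = DublinCoreCollector()
    collector.feed(html)
    return sorted(f for f in required if f.lower() not in collector.fields)


page = (
    '<html><head>'
    '<meta name="DC.title" content="FOIA request log">'
    '<meta name="DC.creator" content="">'
    '</head><body></body></html>'
)
print(missing_metadata(page))  # ['DC.creator', 'DC.date']
```

A tool that refused to publish whenever this list came back non-empty would be “insisting” in the sense I mean. The hard part is getting a check like this into whatever software document authors actually use.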

But I suppose the people at the Department of Justice know a thing or two about getting people to comply with rules. Good luck to them.

Photo by Peter Eimon