Following the Law


One of the most precise ways to follow an issue you care about is to identify where it sits in the law, and watch how people are attempting to change it for good or ill.

For example, as a transparency organization, we follow the Freedom of Information Act quite closely, which is codified primarily as 5 USC § 552. Up to now, we’ve been following it in Scout by searching for specific text strings, such as “5 U.S.C. 552” and “section 552 of title 5”.

But setting up multiple searches is inconvenient, and neither of these catches results that cite subsections. For example, some bills affecting exemptions to FOIA will cite “section 552(b)(3) of title 5”. Setting up searches for each subsection, in each common citation format, isn’t practical.

So we’ve made Scout’s citation searching much smarter: now, if you search for “5 usc 552”, or “section 601 of title 5”, you’ll see that it returns results matching a variety of formats and subsections in bills and regulations, all at once.

This makes keeping up with proposed changes to laws much more reliable and complete, and we hope you find it useful.

Under the hood

Technically, this is accomplished by pre-scanning all bills and regulations for any kind of citation to the US Code. We do this by running the text through Citation, a JavaScript library for citation extraction. It finds citations, pulls out an excerpt, breaks the citation down into its component parts, and assigns it a unique ID. Right now, it’s using just two regular expressions to detect citations, but it seems to catch a great many of them.
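To make the extraction step concrete, here is a minimal sketch in Python of the same idea: find a citation, keep an excerpt, break it into parts, and assign an ID. It is not Citation's actual code. The two patterns, the `extract_citations` function, and the `usc/<title>/<section>` ID format are all assumptions made for illustration.

```python
import re

# Two illustrative patterns (assumed; not the library's real expressions).
USC_PATTERNS = [
    # Matches forms like "5 U.S.C. 552(b)(3)", "5 usc 552", "5 USC § 552"
    re.compile(
        r"\b(?P<title>\d+)\s+U\.?\s?S\.?\s?C\.?\s*(?:§+\s*)?"
        r"(?P<section>\d+[a-z]?)(?P<subs>(?:\([A-Za-z0-9]+\))*)",
        re.IGNORECASE,
    ),
    # Matches forms like "section 552(b)(3) of title 5"
    re.compile(
        r"\bsection\s+(?P<section>\d+[a-z]?)(?P<subs>(?:\([A-Za-z0-9]+\))*)"
        r"\s+of\s+title\s+(?P<title>\d+)",
        re.IGNORECASE,
    ),
]


def extract_citations(text, context=40):
    """Return one dict per US Code citation found, in order of appearance."""
    found = []
    for pattern in USC_PATTERNS:
        for m in pattern.finditer(text):
            title = m.group("title")
            section = m.group("section").lower()
            found.append({
                # Hypothetical ID format: subsections are kept as parts but
                # left out of the ID, so every subsection of 552 shares one ID.
                "id": "usc/{}/{}".format(title, section),
                "title": title,
                "section": section,
                "subsections": re.findall(r"\(([A-Za-z0-9]+)\)", m.group("subs")),
                "match": m.group(0),
                "index": m.start(),
                # Surrounding text, clipped to the bounds of the document.
                "excerpt": text[max(0, m.start() - context):m.end() + context],
            })
    found.sort(key=lambda c: c["index"])
    return found
```

Because the ID in this sketch stops at the section level, a citation to “section 552(b)(3) of title 5” and one to “5 U.S.C. 552” end up grouped under the same ID, which is what lets one search cover every subsection and format.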

On its end, Scout checks whether the search query looks like a US Code citation, converts it to the same kind of unique ID that Citation uses, and asks for any bills or regulations that have that US Code section ID associated with them. So, although the search results in Scout look similar to a full text search, Scout is actually just filtering on an ID and displaying the excerpts we already extracted from each bill and regulation.
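The search side can be sketched the same way. This Python example is an illustration only, not Scout's implementation. The `query_to_section_id` and `search` functions, the `usc/<title>/<section>` ID format, and the in-memory `CITATION_INDEX` with its placeholder documents are all hypothetical.

```python
import re

# Assumed patterns: the whole query must be a single US Code citation.
QUERY_PATTERNS = [
    # "5 usc 552", "5 U.S.C. 552(b)(3)", "5 USC § 552"
    re.compile(
        r"^\s*(?P<title>\d+)\s+U\.?\s?S\.?\s?C\.?\s*(?:§+\s*)?"
        r"(?P<section>\d+[a-z]?)(?:\([A-Za-z0-9]+\))*\s*$",
        re.IGNORECASE,
    ),
    # "section 601 of title 5"
    re.compile(
        r"^\s*section\s+(?P<section>\d+[a-z]?)(?:\([A-Za-z0-9]+\))*"
        r"\s+of\s+title\s+(?P<title>\d+)\s*$",
        re.IGNORECASE,
    ),
]


def query_to_section_id(query):
    """Return a section ID if the query looks like a citation, else None."""
    for pattern in QUERY_PATTERNS:
        m = pattern.match(query)
        if m:
            return "usc/{}/{}".format(m.group("title"), m.group("section").lower())
    return None


# Hypothetical pre-built index: section ID -> (document, stored excerpt) pairs.
CITATION_INDEX = {
    "usc/5/552": [
        ("Placeholder bill", "...amends section 552(b)(3) of title 5..."),
        ("Placeholder regulation", "...requests made under 5 U.S.C. 552..."),
    ],
}


def search(query, index):
    """Filter on a section ID; None tells the caller to use full text search."""
    section_id = query_to_section_id(query)
    if section_id is None:
        return None
    return index.get(section_id, [])
```

Here `search("5 usc 552", CITATION_INDEX)` returns both stored excerpts regardless of which citation format each document used, while a query like “freedom of information” returns `None` and would be handled as an ordinary full text search.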


There’s a lot left to do here: some complicated US Code citations aren’t yet caught, and we’d like to expand to other citation types (such as the CFR). Scout also only does this kind of special search for “simple phrase” searches, so it’s currently not possible to filter on a US Code citation and another search term at the same time. We’ll add that before long.

I expect that we’ll eventually publish our extracted citation data in bulk. However, right now it still misses enough complex citations that it may not yet be useful as a canonical set of legal links, even though it’s clearly useful in an integrated search context.

But the code to extract them, Citation, is open source, and even slightly documented. Contributions are helpful and welcome – we hope that it can grow into something that everyone can use, so that this work doesn’t have to be done again.