About Gov404: The Web Integrity Project’s Censorship Tracker

Gov404: The Web Integrity Project’s Censorship Tracker aggregates and verifies the most significant documented examples of online information censorship on the federal Web since November 8, 2016. The findings come from reporting by the Web Integrity Project (WIP) team, the news media, and other accountability organizations. Gov404 is the largest collection of federal Web removals and access reductions ever compiled.

The sections below provide detailed information on Gov404.


What is Gov404?

Gov404 is the largest collection of documented changes to federal government websites ever assembled. The dataset compiles the most significant changes: removals of content (including datasets, documents, pages, infographics, links, and text) and reductions in access to information on a federal website. It is intended to support journalistic and academic research by providing information on individual website changes and revealing broader patterns of Web censorship; to provide evidence to drive congressional oversight of removals of public information; and to serve as a basis for understanding the need for, and areas for improvement in, federal Web governance reform.

Gov404 consists of two parts: 1) an interactive data table at the top of the page, which highlights key details about each finding, and 2) supporting material at the bottom of the page that provides comprehensive technical details about each finding in the table.

1) The interactive data table allows users to:

  • Explore additional information about an individual entry in the dataset by using the expandable menu (the plus sign on the left of each row). Users will find more detailed information about the website changes, including links to sources who discovered them, news coverage, agency acknowledgements, and additional reading.
  • Organize the entries using the sort function (the triangle icon at the top of each column). To get a feel for the data and a sense of the who, what, and when of all of the findings, users can sort by topic area, agency, the earliest and latest date changes could have been made, and type of content changed.
  • Search the entries using the search function (at the top right of the data table). Users can search for all of the findings in the table that include a given term by typing the term into the search bar. For example, typing “immigration” in the search bar will return all entries in which the term “immigration” appears.
  • Filter the entries using the filter function (the row at the bottom of the data table). Users can filter each column by relevant terms by typing the term into the bottom cell of a column. For example, typing “yes” in the bottom cell of the “Agency Notice” column filters the column so that only entries that have a “yes” in that column (i.e., findings in which notice of the change was provided by the agency) appear in the table.
  • Download or copy the data for later analysis (using the icons above the search bar near the top of the page). If users want to store the data for their own analysis, the dataset can be downloaded in MS Excel and .csv formats. The data table can also be printed or copied (as tab-separated values).
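
For readers who download the .csv export, the search and filter behavior described in the list above can be approximated offline. The following is a minimal sketch, not WIP's implementation. The sample rows are invented for illustration, every column name except “Agency Notice” is an assumption about the export's header, and case-insensitive substring matching is also an assumption.

```python
import csv
import io

# Hypothetical sample standing in for the downloaded .csv file. The rows and
# every column name except "Agency Notice" are assumptions for illustration.
SAMPLE_CSV = """Agency,Topic,Agency Notice
EPA,Climate change,no
USCIS,Immigration,yes
HHS,Health care,yes
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


def search(rows, term):
    """Approximate the search bar: keep rows where any cell contains the term."""
    term = term.lower()
    return [r for r in rows if any(term in (v or "").lower() for v in r.values())]


def filter_column(rows, column, term):
    """Approximate the per-column filter: keep rows whose cell in `column` contains the term."""
    term = term.lower()
    return [r for r in rows if term in (r.get(column) or "").lower()]


rows = load_rows(SAMPLE_CSV)
immigration_rows = search(rows, "immigration")
noticed_rows = filter_column(rows, "Agency Notice", "yes")
```

To run this against a real download, replace `SAMPLE_CSV` with the contents of the exported file and adjust the column names to match its header row.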

The schematic below provides information about Gov404 and its features.

2) For every entry in Gov404, we provide supporting material from reliable sources, which includes links to technical documentation about the changes published by WIP, the Environmental Data & Governance Initiative (EDGI), other watchdog organizations, and media outlets. If the original source did not write a technical assessment, WIP has vetted the finding in the form of a “Tracker Item” (see one for example). These “Tracker Items” thoroughly document the changes and provide links to the different versions of altered or removed webpages from the Internet Archive’s Wayback Machine so that users can independently verify the changes.



What Website Changes are Included in Gov404?

Gov404 includes changes to or removals of website resources that substantially reduce access to content that is relevant to website users or that reflects agency policy. Each change included in Gov404 must meet the following criteria:

  1. The change was documented by WIP or other reliable sources.
  2. The change was made to content that provided relevant information to users of the website or which reflected agency policy.
  3. The change amounts to a substantive reduction in access to content, including removals of databases, entire websites, webpages, documents, and sections of webpages.
  4. The change results in more or less information than what is clearly articulated in relevant policy changes. For instance, we would not include a webpage that was altered to include accurate information about a new and announced regulation, law, or guidance.
  5. The change must amount to more than language alterations; removals, additions, or alterations to words, sentences, titles, or link text are not included.
  6. The latest the change could have occurred was after November 8, 2016.
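
The six criteria above act as a conjunction: a finding is included only if every one holds. The sketch below expresses that logic as a single predicate. It is an illustration only, since WIP applies these criteria through editorial review rather than code; the `Finding` class and all of its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date

CUTOFF = date(2016, 11, 8)  # the date named in criterion 6


@dataclass
class Finding:
    # All field names are hypothetical; each mirrors one criterion in the list.
    documented_by_reliable_source: bool   # criterion 1
    content_relevant_or_policy: bool      # criterion 2
    substantive_access_reduction: bool    # criterion 3
    explained_by_announced_policy: bool   # criterion 4 (True means excluded)
    language_alteration_only: bool        # criterion 5 (True means excluded)
    latest_possible_change: date          # criterion 6


def meets_criteria(f: Finding) -> bool:
    """Return True only if the finding satisfies all six inclusion criteria."""
    return (
        f.documented_by_reliable_source
        and f.content_relevant_or_policy
        and f.substantive_access_reduction
        and not f.explained_by_announced_policy
        and not f.language_alteration_only
        and f.latest_possible_change > CUTOFF  # strictly after November 8, 2016
    )
```

For example, a documented, substantive removal whose latest possible date is in 2017 would pass, while the same change would fail if it amounted only to a wording alteration.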

Note: Although language alterations are not included in Gov404, users can read about these types of changes, such as the removal of the word “gender” from content about sex discrimination, in reports listed on WIP’s publication page.

Gov404 is not a comprehensive record of every reduction in access to content since November 8, 2016. Neither is it a representative sample of changes to the federal Web. Website monitoring watchdog groups have focused on monitoring content about environmental and health issues, and thus changes to the websites of environmental and health agencies are overrepresented. Still, although not comprehensive or representative, Gov404 does amount to the largest collection of federal Web changes yet assembled.

We will regularly update Gov404 with new findings of reduced access to and removals of content on the federal Web as we discover them. If you know about a removal on a .gov site that we have not documented, please contact the Web Integrity Project at webintegrity@sunlightfoundation.com.



How do I use the classification system?

WIP has developed a system for classifying changes to websites, covering two broad classes of changes to Web resources:

  • Non-maintenance-related alterations and access reductions
  • Maintenance-related alterations and expansions of access

These classes of changes are further broken into a categorical hierarchy that reflects the extent of removals or access reductions.

Maintenance-related alterations and expansions of access are outside the scope of Gov404. These are changes we should expect any webmaster or agency to make in the course of maintaining a website, such as providing updated, current information; removing outdated or inaccurate information; or updating links to pages with changed URLs. Gov404 instead focuses on non-maintenance-related alterations and reductions of access.

In WIP’s classification, non-maintenance-related alterations and reductions of access fall into eight different categories, each corresponding to a column on the right-hand side of Gov404, defined by the type of content that was altered or removed:

  1. Text and non-text content (called “Text and non-text content” in Gov404)
  2. Links (“Links”)
  3. The location of a webpage or collection of webpages (“Moves and redirects”)
  4. Sections of a webpage or collection of webpages (“Page section”)
  5. An entire webpage or document (“Entire page/doc”)
  6. The existence or location of an entire website (“Entire website”)
  7. Search engines and open data platforms (“Search”)
  8. Datasets (“Datasets”)

Each of these categories contains a series of more detailed subcategories that relate to technical details of the change, such as whether a URL for a removed page redirects, or the type of error notice that a URL returns. In Gov404, these subcategories are designated in the relevant cells. For example, the finding described in entry #29, which details the EPA’s removal of its Climate Change website, lists (b)(ii) in the “Entire website” column. “Entire website” is the sixth category in WIP’s classification, so this corresponds to change category 6(b)(ii) in the hierarchy:

6. Overhauling or removing an entire website
  (b) An entire website is removed or overhauled, and a significant portion or all of the website’s previous URLs redirect to a page that contains a statement that the previous information has been removed:
    (ii) The previous website is not replaced with new content.

Similarly, in entry #48, which describes changes to the USGS Science Explorer search function, (a)(i) is listed in the “Search” column. This corresponds to change category 7(a)(i) in WIP’s classification:

7. Altering or removing search engines and open data platforms
  (a) Altering a search engine or open data platform that provides access to documents, datasets, or information that are accessible elsewhere on the website:
    (i) Altering the search function and output such that the same search results are presented or prioritized differently.
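
The mapping from a table cell to a full category code, as in the two examples above, can be sketched in a few lines. This is an illustration under stated assumptions rather than part of WIP's tooling: the function name `full_code` is hypothetical, the column order is taken from the eight-category list in this section, and the pattern handles only the letter-plus-numeral form shown in the two examples (with or without parentheses around the letter).

```python
import re

# Column order follows the eight categories listed in this section,
# so a column's position (starting at 1) is its category number.
CATEGORY_COLUMNS = [
    "Text and non-text content",
    "Links",
    "Moves and redirects",
    "Page section",
    "Entire page/doc",
    "Entire website",
    "Search",
    "Datasets",
]


def full_code(column, cell):
    """Combine a column name and its cell value into a full category code."""
    number = CATEGORY_COLUMNS.index(column) + 1
    # Accept "(b)(ii)" as well as the looser "b(ii)" form.
    match = re.fullmatch(r"\(?([a-z])\)?\(([ivx]+)\)", cell.strip())
    if match is None:
        raise ValueError(f"unrecognized subcategory: {cell!r}")
    letter, numeral = match.groups()
    return f"{number}({letter})({numeral})"
```

Under these assumptions, `full_code("Entire website", "(b)(ii)")` yields the code for entry #29, and `full_code("Search", "a(i)")` yields the code for entry #48.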

For more information about WIP’s classification, see: “How to classify changes to government websites: A classification of Web content alterations and changes in access to Web resources.”


What acronyms and abbreviations are used in Gov404?

Listed below are the department, agency, and bureau acronyms and abbreviations used in Gov404.

Departments, agencies, and bureaus:

Acronym or Abbreviation   Department, Agency, or Bureau
BLM                       Bureau of Land Management
CDC                       Centers for Disease Control and Prevention
CMS                       Centers for Medicare & Medicaid Services
Dept. Ed.                 U.S. Department of Education
DHS                       U.S. Department of Homeland Security
DOD                       U.S. Department of Defense
DOE                       U.S. Department of Energy
DOI                       U.S. Department of the Interior
DOJ                       U.S. Department of Justice
DOL                       U.S. Department of Labor
DOT                       U.S. Department of Transportation
EPA                       Environmental Protection Agency
HHS                       U.S. Department of Health and Human Services
HUD                       U.S. Department of Housing and Urban Development
NPS                       National Park Service
OPM                       Office of Personnel Management
SBA                       Small Business Administration
State                     U.S. Department of State
Treasury                  U.S. Department of the Treasury
USCIS                     U.S. Citizenship and Immigration Services
USDA                      U.S. Department of Agriculture
USGS                      United States Geological Survey
VA                        U.S. Department of Veterans Affairs


Why did WIP put this Tracker together?

At WIP, our mission is to monitor changes to government websites, holding our government accountable by revealing shifts in public information and access to Web resources, as well as changes in stated policies and priorities. We created Gov404 to further that mission and to provide a public, easily accessible, and user-friendly resource to:

    • Reveal patterns in changes to federal websites
    • Support journalistic and academic research
    • Provide non-partisan evidence to drive congressional oversight
    • Provide a basis for understanding how federal Web governance can and should be reformed


Where do we compile findings from?

Gov404 compiles findings from watchdog organizations (such as the Environmental Data & Governance Initiative and the Project on Government Oversight), news outlets, and WIP’s own work.


Who put Gov404 together?

While the data contained in Gov404 comes from many organizations and news outlets, members of the Sunlight Foundation’s Web Integrity Project assembled Gov404. Material for Gov404’s launch was put together by Toly Rinberg, Andrew Bergman, Rachel Bergman, Sarah John, Aaron Lemelin, and Jon Campbell. Gov404 is maintained by the current members of the Web Integrity Project.


Who should I contact about Gov404?

If you have questions, suggestions, or comments about Gov404, please contact the Web Integrity Project at webintegrity@sunlightfoundation.com.   
