A new way to track Web censorship under Trump: Gov404


Today, WIP is launching Gov404: The Web Integrity Project’s Censorship Tracker, a new tool to track unjustified removals of online resources and reductions in access to content across the federal government.

Why are we aggregating these unjustified removals? As the government itself states in its Office of Management and Budget’s memorandum on how federal agencies should manage digital content:

“Federal Agency public websites and digital services are the primary means by which the public receives information from and interacts with the Federal Government. These websites and services help the public apply for benefits, search for jobs, comply with Federal rules, obtain authoritative information, and much more. Federal websites and digital services should always meet and maintain high standards of effectiveness and usability and provide quality information that is readily accessible to all.”

We couldn’t have put it better ourselves, but theory and practice are not always the same. To us, having “high standards of effectiveness and usability” means that agencies build trust with the public by proactively communicating about significant changes, justifying extensive removals, and establishing robust public archives. Unfortunately, the Trump administration has failed on many of these counts, undermining public information and the institutions that have been in place to make information publicly available. It’s now more important than ever to understand how censorship of website content has brought us so far from OMB’s stated framework.

What can I learn from WIP’s newly launched Gov404?

In order to point out how agencies are failing to live up to OMB’s guidance and numerous other Web governance policies, we have aggregated and verified the most significant cases of online information removal and censorship since the last presidential election on November 8, 2016. Although shortcomings in agency Web management existed long before, we’re picking that date to reveal how the government’s digital presence has evolved since the Trump administration transitioned in and enacted its policies. Gov404 especially aims to reveal how both one-off shifts and broader patterns of change in agency priorities and policies under the Trump administration are reflected online.

The changes that make it into Gov404 are examples of unreasonable information removals or access reductions. They are more than simple content alterations or standard website maintenance. In other words, we’ve focused on cases where the government has attempted to hide content that it doesn’t want the public to see, sought to sow doubt about a policy issue or generated unnecessary confusion, or obscured information that itself had significant public value. The Tracker also highlights when agencies failed to proactively communicate about a change in advance and when proper public archives were not established.

As new findings emerge, we’ll keep expanding our list, which so far comprises 50 cases that meet our threshold. Our list already includes four dataset and search engine takedowns, 13 website removals or overhauls, and 33 document or webpage removals, compiled from more than a dozen different departments. We spotlighted three striking findings, which we thought could use more attention and for which our own additional analysis yielded new results, in an accompanying blog post here.

Gov404 allows users to sort and filter cases by the manner in which the websites were altered, using WIP’s newly implemented classification of website changes. WIP’s classification characterizes all the ways in which a website can be modified, from the removal of links that lead to live pages to the takedown of entire datasets. In so doing, we reveal common issues with Web management across the federal government. See our “About” page tutorial to learn more about how to make the most of the various Tracker features.

Policy shifts under Trump

A number of stark policy shifts that have emerged under the Trump administration are reflected by agency website alterations and are pulled together by Gov404. At the Department of Health and Human Services, websites, documents, and entire sections about the Affordable Care Act directed at policymakers, lawyers, and the general public have been quietly removed without notice. Environmental and scientific agencies, especially the Environmental Protection Agency (EPA), have systematically dismantled climate change information compiled across many websites over more than a decade. Not least of the concerning emerging trends has been the deprioritization of LGBTQ individuals and their health needs, which has manifested on websites through the removal of numerous health and policy resources. See below for a more extensive overview of findings and trends.

The lack of agency transparency in managing online resources has made it necessary for civil society to step in to reveal obfuscation of information by its own government. The cases we’ve aggregated came from website monitoring operations and investigative reporting by our own team, the news media, and other government accountability organizations, like the Environmental Data & Governance Initiative’s Website Monitoring Team and the Project on Government Oversight. When we thought more analysis was necessary, we put together additional documentation, which we present at the bottom of Gov404, to corroborate and expand on others’ findings. Sometimes, this process of documentation revealed new website changes that haven’t been made public before. Findings are verified using the Internet Archive’s Wayback Machine, a non-governmental service that systematically takes snapshots of the Web.

Web governance

There are a number of laws, rules, and guidance documents that apply to federal Web governance, but the extensive censorship examples we’ve aggregated show that all clearly fall short of protecting the public’s continued access to public information.

To name just one systematic failure, the Paperwork Reduction Act calls for “adequate notice” before public information is significantly altered or removed. The Office of Information and Regulatory Affairs reminded all federal agencies as much in early 2017.

Yet, in only one out of 50 cases documented in the Tracker did the given agency alert the public of forthcoming website changes in advance. In only three other cases was notice even provided at all (in each of these it was posted at the same time that the removals occurred). Failure to provide notice creates public uncertainty about the meaning of a Web removal, especially when removed information relates to vulnerable communities or pertains to sensitive issues like personal health. In these cases, it’s not clear whether: 1) removed information was actually inaccurate, 2) the website is undergoing maintenance and the content will return, or 3) the removal was motivated by a quiet change in underlying policy, unrelated to the information’s accuracy, but which hasn’t been communicated transparently.

When we suggest to agencies that communicating with the public about significant changes to information is an important way to build trust and reduce confusion, we get varied responses. In some cases, agencies engage in constructive discussion and restore removed content, perhaps realizing that the removals are actually detrimental to the public. In other cases, we get pushback from agency spokespeople who defend the removals by citing reasons such as the content being out of date or traffic to the resource being too low to justify its upkeep.

Providing notice is a good start, but it is not enough on its own. In some cases, the archiving practices corresponding to a given removal are woefully inadequate, too, especially when those archives are effectively inaccessible to the public. Twenty-seven of the 50 website changes documented in Gov404 were not preserved in a public Web archive; in five additional cases the archives were incomplete in a significant way. Even when public archives do exist, they are rarely easily accessible to the public or well-marked and linked to on agency websites, diminishing their value as a possible anchor point for the public when websites are altered.

In many cases, while a lack of notice and inadequate archives sow further confusion, there’s just no rational justification for the removal of the content to begin with. What is needed are best practices for when and how agencies can remove content, and those practices must actually be followed and enforced. WIP is working on recommendations and policy frameworks that will help inform Web governance policy and guard against the systemic issues that the Tracker spotlights.

Government websites that are for the people

As the Trump administration undermines public records operations at agencies, subverts access to past administrations’ records during confirmation hearings, and quashes federal reports about public health, the censorship of information across the federal government’s websites has become an important battle line in the fight to protect public information.

When public access to public information is undermined, there are real harms that result from the public losing information about rights or benefits or from confusion about information accuracy. Unfortunately, we’ve documented many instances of the removal of content, often prepared for vulnerable populations, that may result in real harms, including attacks on information relating to women’s health, LGBT anti-discrimination rights, and applying for asylum in the US. We hope Gov404 will bolster government oversight efforts by Congress, civil society, and the media by highlighting some of the most serious cases of Web censorship under the Trump administration and how that censorship has been carried out.

One thing is certain, though: the Web is here to stay. So we don’t just want to look backward. As a part of the government accountability community, we also want to work collectively with federal agencies, lawmakers, and civil society partners to find long-term solutions that will ensure the robustness of how public information is managed and disseminated for the public good.

Below, you’ll find an overview of some of the trends in Web censorship that we’ve seen emerge as we’ve compiled the Tracker. In beginning to analyze trends in the types of changes we’ve seen, and the nature of the content that has been changed, we’ve begun the work of identifying those long-term solutions.



Overview of findings and trends

In aggregating 50 cases of information removals and access reductions so far, Gov404 points to trends in both agency policy shifts and patterns of poor Web governance. While this Tracker does not intend to be comprehensive in time or across federal agencies, here are some revealing findings and trends that we identified at the time of the Tracker’s release:

  • What are some of the most significant shifts in priorities and policies revealed by the Tracker?
    • Affordable Care Act information, including websites, documents, and entire sections, primarily at the Department of Health and Human Services (HHS), was quietly removed without notice. The resulting changes reduced access to information about the law or how to access services provided under the law. See Tracker items: 9, 13, 31, 36, 43, 44.
    • Scientific and policy information about climate change has been systematically censored, most extensively at the Environmental Protection Agency, but also at State, Interior, Transportation, and other department websites. This trend was documented in-depth in a report by the Environmental Data & Governance Initiative. See, also, Tracker items: 3, 22, 23, 24, 27, 28, 29, 30, 34, 35, 38, 42, 50.
    • LGBTQ health and policy resources have been removed from websites focusing on health, labor, housing, education and other areas. See Tracker items: 2, 5, 14, 33, 36, 46, 47, 49.
    • Women’s rights and health information has been selectively removed from the Department of Justice and HHS websites. See Tracker items: 10, 21, 25, 32, 33, 36.
  • For how many changes have agencies provided advance notice? In only one out of 50 cases documented in the Tracker did the given agency alert the public of forthcoming website changes in advance. In only three other cases was notice even provided at all. Those three cases concern the EPA’s climate change, Clean Power Plan, and Clean Water Rule websites, which were removed or overhauled on the same day as the notice was posted. So, while that is technically notice, it’s certainly not “adequate notice” as the Paperwork Reduction Act mandates.
  • How many times have datasets or search engines been altered or removed? Four times. In one case (Tracker item: 6), the lack of an established dynamic, functioning archive of the search engine meant that, in addition to making content inaccessible, the advanced features of the search functionality were entirely lost. The three other cases (Tracker items: 39, 41, 48) are particularly significant because archives were not established at all, so the resources they stored were made entirely inaccessible from the .gov domain. Moreover, non-governmental groups, such as the Internet Archive, were not able to set up third-party archives for all of these removals.
  • Are there public archives available for the removed content? It varies a lot. The Internet Archive’s Wayback Machine, a third-party non-governmental service, has most federal webpages publicly archived. But that’s no replacement for the government’s own public archives, which run the gamut. In some cases, government-established archives are entirely or almost entirely complete. The EPA’s January 19 snapshot is a good example, even if there are certain significant portions missing. Unfortunately, in many other cases, such as HHS’s removed National Guideline Clearinghouse and WhiteHouse.gov’s “1600 Daily” archive, no public archive has been identified. As we mention above, it’s rare that datasets or search engines are publicly archived. Overall, we found:
    • No public Web archive for 27 Tracker items
    • Incomplete or partial archive for 5 Tracker items
    • Complete archive for 18 Tracker items
  • When did most of the removals occur? It’s hard to say for certain. First of all, a lot of the findings in the Tracker have been compiled from ad-hoc sources, rather than systematic monitoring operations. That makes it difficult to say what website changes we may have missed. Examining the items that did make it into the Tracker reveals that approximately half of the changes occurred within the first year of the Trump administration. For instance, one of the biggest changes we’ve seen, the removal of the EPA’s climate change website, happened early in the administration, which makes sense given that its stated purpose was to reshape the EPA’s entire image to comport with the new administration’s views. Removals of public information at that scale, and by a new administration, make a real statement. But changes are not all the same: for example, a website or dataset removal should not be directly compared to a small removed section on a webpage. Until a more systematic analysis is conducted, we generally avoid making claims about any patterns in time.
  • Which agencies or departments removed the most information under the Trump administration? We don’t really know. Since most federal websites have not been systematically analyzed, there may be a lot of removed information that we just don’t know about. WIP and EDGI have been the biggest contributors of significant website change findings over the last two years, focusing their monitoring operations on health and environmental agencies, respectively. For that reason, the Tracker over-represents findings from HHS, the EPA, and other environmental agencies. Many of the other Tracker items were discovered on an ad-hoc basis, and not through systematic monitoring operations.

Sign up for WIP updates here.