Classifying changes to public access to information on US government websites
What did this government website look like yesterday? A month ago? A year ago? Why does this URL redirect? When did this link stop working? These are questions we rarely ask when browsing the Web, but the answers can be important, especially as we seek to hold the United States government accountable for what information is made available on federal websites, how agencies choose to present that information, and what alterations to content and access are made.
Even when you do decide to ask these questions, the tools to clearly understand changes to Web content and websites can be difficult for the public to use – and the language to describe the technical aspects of these changes is often not well defined.
There has been ongoing confusion in the media about the nature of the many changes to federal websites and removals of information that have occurred since the beginning of the Trump administration.
There are many ways that Web content can be altered. We’re talking about much more than just data removal, which has occurred in only one confirmed case since the beginning of the Trump administration. Most changes can and have been made to other types of content, including:
- Ephemeral information, such as text and images, that has been altered
- Documents that have been moved to new URLs or removed altogether
- URLs that have been redirected, sometimes to pages whose content is unrelated to that of the page at the original URL
In an effort to understand and document the ways in which federal websites change, we’ve developed a classification of Web content alterations and changes in access to Web resources. Using this classification approach, our goal is to clearly document changes across federal websites in order to inform the public, the news media, and lawmakers about the types of changes that are really occurring and help advocate for better systems for storing, managing, and presenting public Web content.
We hope our classification system can help direct the concern and attention this subject has gotten toward addressing the types of changes that have actually occurred and determining solutions for maintaining public access to public information.
8 classes of reductions to public access to public information online
Below, we’ve outlined eight general classes of website changes that go beyond standard Web maintenance procedures. They correspond to the “Classification of Non-Maintenance Web Content Alterations and Access Reductions to Web Resources” that we’ve established.
- Altering or removing text and non-text content
- Altering or removing links
- Moving an entire webpage or collection of webpages or establishing redirects
- Altering or removing an entire pertinent section of a webpage or collection of webpages
- Removing an entire webpage or document
- Overhauling or removing an entire website
- Altering or removing searchable Web portals
- Altering, removing, or deleting datasets
In our website monitoring classification, we dive more deeply into the technical aspects of each of these classes, detailing what mechanisms must be implemented to make each type of change. Whenever possible, we provide examples for the various classifications, largely drawing from the collective work of the Environmental Data & Governance Initiative’s website monitoring efforts.
What about adding new information or increasing information access?
You’ll notice that the classes above don’t capture information about the addition of content or increase in information access.
It’s important to differentiate between changes that remove public information and changes that introduce new information because they aren’t symmetric. Removing information or access can have important consequences, and our policies and standards for governing Web management should reflect that asymmetry.
For instance, removing a series of factsheets about the significance of clean water for different stakeholders might impede the public’s ongoing use or dissemination of those resources. Producing new factsheets of a similar type, however, wouldn’t directly get in the way of current processes or plans.
Federal agencies should be held to a particularly high standard and be asked to explain why a resource needs to be removed from public access, especially because a removal can be confusing to the public. It’s often not clear whether a removal indicates that the resource is now considered out of date, whether the agency is just clearing space on its servers, or whether there are other reasons for the removal.
Moreover, removals for potentially legitimate reasons can still cause alarm when they become confused with errors in the handling of Web content or intentional obfuscation.
For these reasons, we decided to establish a separate “Classification of Web Content Alterations Relating to Web Maintenance and Expansion of Access to Web Content” to help understand what to expect in these circumstances. This classification will be substantially expanded in the future and examples for individual classes will be added.
Agency notification and Web content management
While updates to Web content often occur as part of normal and expected Web maintenance, government agencies and offices manage their website updates and post notice about changes differently.
In addition, as agencies alter Web content and reduce access to Web resources, they have different approaches to either explaining or obfuscating the nature of what has been changed. They also make use of agency archives in different ways, enabling different levels of access to resources that have been removed or altered.
To address the technical mechanisms relating to agency practices for posting notice and archiving resources, we decided to create three separate “Classifications of Approaches to Changes and Access Reductions”:
- Assessment of continued availability of affected Web resources on agency websites
- Classes of storage of affected Web resources in agency Web archives
- Assessment of notice provided or explanation of change by agency
The goal of these classifications is to provide additional context for a change:
- If a resource was removed, is it still hosted elsewhere, whether in an archive or on another website?
- If the resource is stored in an agency Web archive, is it searchable or not?
- And finally, is the agency communicating about these changes and pointing to archived content, and, if so, is it communicating proactively or reactively?
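As a rough illustration of how these three questions could be recorded alongside a classified change, here is a minimal Python sketch. The names `ChangeContext`, `Archive`, and `Notice` are hypothetical and are not part of our published classification. The example values come from the Department of Agriculture case discussed later in this post, with anything we have not established left as unknown.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Archive(Enum):
    """Hypothetical labels for how a resource is stored in an agency Web archive."""
    NONE = "not available in an agency Web archive"
    NOT_SEARCHABLE = "archived, but not searchable"
    SEARCHABLE = "archived and searchable"


class Notice(Enum):
    """Hypothetical labels for how the agency communicated about a change."""
    NONE = "no notice or explanation"
    PROACTIVE = "announced before or at the time of the change"
    REACTIVE = "explained only after the change was made"


@dataclass
class ChangeContext:
    """Answers to the three context questions; None means 'not yet known'."""
    hosted_elsewhere: Optional[bool]
    archive: Optional[Archive]
    notice: Optional[Notice]

    def open_questions(self) -> List[str]:
        # List the questions that still need an answer.
        return [name for name, value in vars(self).items() if value is None]


# Department of Agriculture animal welfare reports: the removal was announced
# proactively and no public agency Web archive was made available.
usda = ChangeContext(
    hosted_elsewhere=None,  # not established in this post
    archive=Archive.NONE,
    notice=Notice.PROACTIVE,
)
print(usda.open_questions())
```

Keeping unknown answers explicit, rather than defaulting them to "no," makes it easier to see which changes still need follow-up research.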
Understanding these aspects allows us to hold our government accountable and advocate for better Web resource management that keeps the public informed and able to access important resources.
Putting the classification to use to “track changes” to federal .gov
Our website monitoring classification starts with broad categories of changes and breaks them down into more specific cases, allowing for granular coding.
For example, class [2] corresponds to “Altering or removing links,” and class [2a] corresponds to a special case: “Not updating a link to a page that has been moved to a new URL.”
Class [5] corresponds to “Removing an entire webpage or document,” and class [5c.ii] denotes a webpage or document removal when “The previous URL redirects to an existing or new URL for another page” and “The page is related yet substantively different.”
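To make the code structure concrete, the following Python sketch parses a code such as [5c.ii] into its levels and lists the broader classes it falls under. It assumes that every code follows the digit, letter, roman-numeral pattern seen in the examples above. The function names `parse_code` and `lineage` are hypothetical, and only the eight top-level labels are taken from our classification.

```python
import re
from typing import List, NamedTuple, Optional

# Top-level labels copied from the list of eight classes earlier in this post.
TOP_LEVEL = {
    1: "Altering or removing text and non-text content",
    2: "Altering or removing links",
    3: "Moving an entire webpage or collection of webpages or establishing redirects",
    4: "Altering or removing an entire pertinent section of a webpage or collection of webpages",
    5: "Removing an entire webpage or document",
    6: "Overhauling or removing an entire website",
    7: "Altering or removing searchable Web portals",
    8: "Altering, removing, or deleting datasets",
}


class ClassCode(NamedTuple):
    top: int                # broad class, e.g. 5
    sub: Optional[str]      # special case, e.g. "c"
    case: Optional[str]     # finer case, e.g. "ii"


# Assumed pattern: one digit, then an optional letter, then an optional
# ".roman-numeral" suffix that may only appear after a letter.
_CODE = re.compile(r"(\d)(?:([a-z])(?:\.([ivx]+))?)?")


def parse_code(text: str) -> ClassCode:
    """Parse a code written as '5c.ii' or '[5c.ii]'."""
    match = _CODE.fullmatch(text.strip().strip("[]"))
    if match is None or int(match.group(1)) not in TOP_LEVEL:
        raise ValueError(f"unrecognized class code: {text!r}")
    return ClassCode(int(match.group(1)), match.group(2), match.group(3))


def lineage(text: str) -> List[str]:
    """Return the code followed by its broader parents, most specific first."""
    code = parse_code(text)
    levels = [str(code.top)]
    if code.sub:
        levels.append(f"{code.top}{code.sub}")
    if code.case:
        levels.append(f"{code.top}{code.sub}.{code.case}")
    return list(reversed(levels))


print(lineage("[5c.ii]"))
print(TOP_LEVEL[parse_code("[2a]").top])
```

Because each specific code rolls up to a broader class, changes coded at different levels of detail can still be counted together under the same top-level class.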
Using this system, we can now systematically classify how federal Web resources are being changed and managed by various agencies and offices. We’ll soon be launching a tracker to more comprehensively document federal website changes using our classification system.
Take, for example, the changes made to the Department of Energy’s Office of Technology Transitions website this year. Links to external clean energy webpages were removed, corresponding to class [2c]: “Removing a link to a page that has not been permanently removed or has not been removed at all.”
Or consider how the Environmental Protection Agency (EPA) overhauled its Clean Water Rule website, establishing a redirect from all the previous webpage URLs to the home page of the new website hosted at a new URL, which corresponds to class [6d.ii]. The EPA did proactively announce the launch of the new website and most, but not all, of the content was archived on the EPA’s Web archive.
The U.S. Geological Survey (USGS) made changes to its Science Explorer, a search portal that links to resources hosted elsewhere on the USGS website, removing search results from the search output. These changes correspond to class [7a.ii]. USGS commented on these alterations following media attention and only after the changes to the portal were made.
In the one case in which datasets were removed from the government’s Web presence, the Department of Agriculture removed animal welfare reports and pages hosting these reports, which corresponds to class [8a]. While the agency announced the removal of content proactively, it made no public Web archives available.
While we hope our classification approach will enable more public discourse about these issues, we also understand that it’s just a first step. Understanding and differentiating between the more technical aspects of changes to government websites is important, but really understanding the significance of a change and its policy implications requires both more information and subject-matter expertise:
- How many members of the public use that Web resource and how regularly?
- Do companies or civil society organizations rely on the resource for important functions?
- Do other government officials rely on the resource?
- Do scientists need this resource to enable future research?
- Do important stakeholders have their own copies downloaded or can the public access the resources elsewhere on the Web?
These are important questions we intend to consider over the coming weeks, building on the insights that we draw from our classification system.
How to improve Web management and increase government accountability?
The federal government manages websites, datasets, and other Web content, presenting public information for the public good. Agencies often have broad discretion over how they can change their respective websites. In particular, as administrations change, programs start and end, new laws are passed, new policies are put in place, priorities shift, and the overall work of the federal government continues, websites and Web content are changed as well. We have already seen sweeping changes to federal environment, climate, and energy websites since the start of the Trump administration.
Changes to Web content can be especially confusing and opaque to the public when agencies don’t proactively document and explain how and why they change their websites. Certain changes to websites can significantly limit access to valuable public resources that are paid for by taxpayers and, often, the public does not know when such actions are being taken.
By clearly laying out how agencies are managing Web resources and changing websites, our goal is to inform the public and lawmakers, gain insight into better systems for digital resource management, and provide information that can help keep our government accountable.