Why do we trust researchers with sensitive data?

(Photo credit: UCL Institute of Education)

As we explored earlier, the 1970s brought a range of new legal protections for individuals and restrictions on sharing personally identifiable information (PII). Major privacy laws such as HIPAA and FERPA prevent holders of health, education and financial data from sharing PII-containing data sets in most contexts.

The whole point of privacy laws is to restrict data-sharing. They generally require that data-holders obtain the consent of each data provider before sharing information. For research drawing on hundreds or thousands of individual data points, obtaining consent one person at a time is impractical; if it were the only route to assembling a usable data set, most research simply wouldn’t happen.

Happily for those researchers, most privacy laws also contain exceptions for tightly controlled research, especially research that contributes to general knowledge. The past 50 years of developments in privacy law have been matched by equally substantial developments in the protection of research subjects: over that period, the U.S. has built norms and laws to protect human subjects of medical and behavioral research, along with a legal architecture that clearly defines what we mean when we talk about the research use of protected data. This standard for acceptable institutional use of protected data is high enough to keep many unaffiliated or casual researchers from accessing information; on the other hand, the high standard also allows governments to feel comfortable releasing data to institutions that meet it.

The Birth of IRBs

Just as the revolution in federal privacy law occurred in response to fears of data misuse, the emergence of research review processes occurred as people learned more about ethical violations in medical research. In the mid-20th century, the Nuremberg Trials of Nazi war criminals included testimony about grotesque experiments conducted by doctors on concentration camp internees. Dr. Henry Beecher, an American anesthesiologist who served in World War II, was struck by those examples and went on to identify a number of studies conducted in the United States that he felt also breached ethical standards. He exposed these studies through talks and a summary article, “Ethics and Clinical Research,” published in the New England Journal of Medicine. Spurred by Beecher’s work, the National Institutes of Health and the U.S. Public Health Service (PHS) announced increased ethical review processes for the work they sponsored: the 1966 Surgeon General’s Directives on Human Experimentation required that all new or ongoing work be approved by “a committee of [the investigator’s] institutional associates [to] assure an independent determination: (1) of the rights and welfare of the individual or individuals involved, (2) of the appropriateness of the methods used to secure informed consent, and (3) of the risks and potential medical benefits of the investigation.”

Unfortunately, PHS apparently failed to use this directive to adequately vet its existing projects. One of its studies on the effects of untreated syphilis, conducted for 40 years on a group of African-American men in Tuskegee, Ala., who were falsely told they were receiving treatment, continued until a whistleblower exposed it in 1972. The national response to this revelation led to congressional action: through the National Research Act of 1974, Congress created a commission to determine best practices for the protection of human subjects.

From this point on, federally funded research projects were required to protect research subjects from risk and to inform them fully about the nature of the studies that included them. The National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research produced the Belmont Report, whose recommendations led federally funded research institutions to establish Institutional Review Boards (IRBs) to evaluate and oversee all human subjects research. The commission’s principles centered on protecting the people who were the subjects of experiment and analysis, which led IRBs to focus on reducing subjects’ risk as much as possible, including through strict protection of their privacy and confidentiality.

Federal adoption of these recommendations means that IRBs play an essential role in protecting individuals’ PII in research performed across the nation. Federal funding supports around 60 percent of research performed at universities and around 30 percent of all research conducted nationally. Because federal funding is so pervasive, even research conducted without direct federal support is strongly shaped by the norm of IRB oversight. All major research-conducting academic institutions have set up their own IRBs, and for institutions without one, third-party commercial IRBs can be approved by federal agencies to satisfy IRB requirements.

Further privacy precautions in research

In addition to federal guidelines, some states have created their own authorities to regulate research methodology and access to sensitive information. California’s Committee on the Protection of Human Subjects (CPHS), a division of the Office of Statewide Health Planning and Development, conducts reviews, subject to the state’s Information Practices Act, before approving the use of personally identifiable data from state agencies for research. If the research uses data from a state department, that agency will sometimes have its own regulatory board as well. For instance, per its own guidelines, the California Department of Corrections and Rehabilitation’s Research Advisory Committee must approve all research using inmate data, independently of the state’s CPHS.

This extensive network of laws and oversight means that researchers take the confidentiality of their data very seriously. Some kinds of data must be maintained in compliance with FISMA standards, which in many cases requires a substantial hardware investment. Data security is a major element of IRB review: plans submitted by researchers must describe how the data will be physically secured, including the choice of appropriate hardware, locks and rooms where the data will be kept, and limits on the number of people permitted direct access. The guiding notion is to prevent the release of PII wherever possible, which means that even IRB-approved data is stripped of major identifiers before being given to researchers whenever that is feasible.

In many research contexts, personal relationships also become an important intervening factor on the way to accessing sensitive data. Many research grants are funded specifically to support interorganizational and interdisciplinary cooperation, and building the relationships needed for joint research is a social process that depends on mutual trust and confidence. Whether or not it is a primary factor, the researchers we’ve surveyed in our work make it evident that a sense of mutual respect and comfort improves the odds that researchers and data-holding institutions will find good ways to work together, striking a productive balance between the researcher’s project objectives and the data-holder’s need for security.

In a future post, we will explore the legal context that permits the sharing of protected individual-level data. While IRBs and other review mechanisms play an important role in ensuring that research institutions are eligible for governmental partnerships, governments have their own internal requirements to address as well.