California’s new OpenJustice initiative illuminates criminal justice data

Screenshot of OpenJustice.

Transparency. Greater trust. A policy development process that’s rooted in clear, public-facing evidence.

These are some of the goals that governments hope to achieve through the implementation of an open data program. These same goals, greater community trust and better evidence-based policy, are often strongly desired by agencies that are part of the criminal justice system as well. As a result, state and local criminal justice agencies across the country are increasing their implementation of open data as a way to achieve improvements in these critical areas.

However, as with any other government agency, criminal justice-related agencies are concerned with the potential that things could go wrong in the course of opening their data to the public. They worry that information that’s presented neutrally might be interpreted in unfavorable ways. They are concerned that they will inadvertently release something that has not yet been fully reviewed — or worse, that they will publish data that they are not legally permitted to release.

Meanwhile, the benefits, while potentially large, still seem uncertain as the agency stands at the outset of the initiative.

It can be one thing to see how a city, state or agency has experienced the process of opening data long after the initiative has become routine and integrated itself into the normal governing process. It’s another, though, to see how it feels for criminal justice agencies just after the point of implementation.

We wanted to know: What is it like for a criminal justice agency in the period just after it has put up its data? To learn more about the immediate impacts, the unexpected lessons and the most interesting new things to come out of the implementation of criminal justice open data, we talked with Justin Erlich, special assistant attorney general at the California Department of Justice and the attorney general’s advisor on technology & data, about his experience working with the department’s new “OpenJustice” transparency initiative.

California Attorney General Kamala Harris formally launched the OpenJustice site on Sept. 2, providing individual-level and summary-level data (as well as dashboard-style visualizations) on three important issues: law enforcement officers killed or assaulted in the line of duty; deaths in custody, including arrest-related deaths; and arrests and bookings. The site was lauded for being the first solely state-led open data initiative around criminal justice. This was no small undertaking, either: in addition to achieving multi-jurisdictional buy-in, the project team had to figure out the best ways to make this very important information available.

In providing substantial amounts of individual-level data (and in a state which has more stringent legal privacy protections than many others), OpenJustice had to grapple early on with how to avoid publishing personally identifying information on the portal, while providing enough detail to make the data useful. The team elected to avoid the danger of identifying individuals by removing first and last names. In order to look at age-related issues, the data does classify cases as juvenile or adult, but redacts specific ages or dates of birth. Address information relates solely to location of incidents or relevant local agencies.   
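The de-identification choices described above can be illustrated with a minimal sketch. It is not the OpenJustice team’s actual pipeline: the field names, the sample record and the age-18 threshold for "adult" are all assumptions made for illustration, since the article does not describe the real schema.

```python
from datetime import date

# Assumption: hypothetical field names; the real OpenJustice schema is not given in the article.
REDACTED_FIELDS = {"first_name", "last_name", "date_of_birth"}
ADULT_AGE = 18  # Assumption: threshold used here to split juvenile from adult.


def age_on(day: date, dob: date) -> int:
    """Whole years between a date of birth and a given day."""
    return day.year - dob.year - ((day.month, day.day) < (dob.month, dob.day))


def deidentify(record: dict) -> dict:
    """Drop names and date of birth, keeping only a juvenile/adult flag."""
    age = age_on(record["incident_date"], record["date_of_birth"])
    public = {k: v for k, v in record.items() if k not in REDACTED_FIELDS}
    public["age_group"] = "adult" if age >= ADULT_AGE else "juvenile"
    return public


# Invented sample record, for illustration only.
raw = {
    "first_name": "Jane",
    "last_name": "Doe",
    "date_of_birth": date(2000, 6, 15),
    "incident_date": date(2015, 3, 1),
    "incident_county": "Example County",
}
public_record = deidentify(raw)
print(public_record)
```

The published record keeps the incident location and the age category, which is enough to study age-related patterns, while the fields that could identify a person never leave the department.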

Achieving goals

As is the case for many governments pursuing open data initiatives, the intention behind publishing data on California’s OpenJustice lay both in showing a good-faith commitment to transparency and in improving data availability for better policymaking. In the first six weeks since the project went live, Erlich saw evidence of success on both measures. Social media response has been positive, and advocates have publicly expressed support for the project. The project has received support from local law enforcement as well, at least in part because of the way that it helps them understand how they are doing relative to other places within the state.

OpenJustice has focused specifically on external outreach to potential data users, and that strategy is already paying off. The team has been pleased to see more than 60,000 page views, high levels of data downloads and many comments from site visitors, demonstrating that the site is attracting a wide variety of users, including collegiate and research communities. As another measure of engagement, the department has seen public records requests reference OpenJustice datasets, showing that the site is helping the public become better informed about the department’s information holdings and helping requesters better target the information they want. Visitors have also made ample use of the site’s suggestion box and contact address, emailing questions that allowed the OpenJustice team to make adjustments after the site went live.

The OpenJustice team has also quickly been able to use the process of publishing datasets to produce new findings from the data. For example, working with researchers from UC Berkeley, the team took an early look at what their data revealed about racial disparities in deaths in custody. Sendhil Mullainathan, an economics professor, happened to be exploring the same issue using national FBI data for “The Upshot” in The New York Times. Mullainathan found that while African Americans were disproportionately likely to die in custody relative to their number in the total U.S. population, the percentage of African American deaths in custody was actually roughly equal to the proportion of African Americans in the population of arrestees. As a result, Mullainathan argued, the larger focus should be on racism in the laws and practices leading to arrest.

The OpenJustice team’s experience working on the same topic showed how better data — improved and made more available through an open data initiative — can help governments identify specific priorities that will maximize the effect of their interventions. Using the data that California prepared for open publication on OpenJustice, the team had also found that California’s racial disparity in deaths in custody was not independent of, but rather reflected, a racial disparity in arrests.

With the greater detail available in the state’s data, however, the OpenJustice team was able to go beyond what was possible with national FBI data and become even more specific. Using the state’s data, the team first found substantial racial disparities in arrest rates, a phenomenon the FBI data also showed to be common throughout the country. However, as they further analyzed their data, they discovered that the discretionary booking of juveniles was a particularly large driver of racial disparities. This finding provides precisely the kind of actionable insight that the state can use to develop better practices and reduce racial disparities in the criminal justice system.

While the OpenJustice team may not have originally focused on how their work would improve California’s criminal justice data quality, they have learned that this is part of the function of their work as well. In order to get good value from the data, analysts must be confident that they are using high-quality data. Meanwhile, in governments as elsewhere, data quality can be uneven as a result of cross-source differences in collection and maintenance. As is the case with other open data programs we’ve observed, the process of preparing data for publication shines a helpful light on longstanding issues in data collection and review. The OpenJustice team discovered this issue for themselves in the publication of the state’s death in custody data, and they are already beginning to work with state and local data collectors to improve the quality and consistency of this dataset.

Maintaining engagement

Public response so far suggests that the site is making good progress towards the state’s goals. At the same time, we at the Sunlight Foundation have found that new open data initiatives sometimes cause public employees to worry that the opening of their data will somehow put them at a disadvantage. Since the OpenJustice data release currently focuses primarily on California’s law enforcement data, we asked Erlich how California’s law enforcement community was responding to the site.

Erlich acknowledged that while a small group of law enforcement respondents had expressed concern about open data, most had been cautious but interested, and a number were also strongly positive about the initiative and saw it as being likely to help effect positive change. For the most part, Erlich felt that the primary request from law enforcement about the site had been for the OpenJustice team to include contextual data and remain in good communication with the local departments and agencies, letting them know about new site developments in advance so that they would know how to respond to inquiries. (For example, he related how one local coroner’s office had been taken a bit off guard by a sudden rush of public interest stemming from OpenJustice’s publication of state death in custody data, and wanted to avoid that kind of surprise in the future.) At the Department of Justice, Erlich provides updates to, and collects feedback from, the department’s 21st Century Policing Working Group in order to ensure that the portal stays connected to the department’s related initiatives.

Developing and enlisting internal support for OpenJustice was critical to its success. Getting internal buy-in across the department’s divisions was accomplished through one-on-one meetings, answering questions and getting feedback. The more the team met with internal actors, the more they realized that close coordination benefited the program, making it more useful and interesting across the board. Erlich noted this effort would not have been possible without the dedication and hard work of the staff in the Justice Department’s Criminal Justice Information Services (CJIS) division and its supportive leadership.

In order to maintain the continued, positive engagement with California law enforcement, Erlich felt it was important for the OpenJustice team to think ahead about how data could be interpreted in order to provide useful context for it. The site is intended to provide serious information, not serious surprise; Erlich said the department was committed to “providing all the context so there are no ‘gotcha’ moments.” Providing contextual data alongside key elements of interest will help ensure that the data is being assessed fairly. Some examples of contextualized datasets that we are likely to see coming on OpenJustice include:

  • Geolocated crime rates provided alongside geolocated arrest rates, to make it easier to assess their relationship.
  • Demographic data for the locality alongside incident demographic data, in order to be able to take local demographics into account when evaluating racial or other demographic disparities.
  • Provision of total arrest numbers alongside death in custody data, in order to be able to calculate the rate of deaths relative to the total number of arrests and understand the parameters of the problem.
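
As a rough illustration of why that last kind of context matters, the sketch below turns raw death counts into a rate per 100,000 arrests. The agency names and all counts are invented for the example and are not OpenJustice figures. The function name and the per-100,000 scaling are likewise assumptions, not the site’s published method.

```python
def deaths_per_100k_arrests(deaths: int, arrests: int) -> float:
    """Deaths in custody expressed per 100,000 arrests."""
    if arrests <= 0:
        raise ValueError("arrests must be positive")
    return deaths * 100_000 / arrests


# Invented counts: (deaths in custody, total arrests) for two hypothetical agencies.
agencies = {
    "Agency A": (3, 12_000),
    "Agency B": (3, 60_000),
}

rates = {
    name: deaths_per_100k_arrests(deaths, arrests)
    for name, (deaths, arrests) in agencies.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} deaths per 100,000 arrests")
```

In this made-up case both agencies report the same number of deaths, yet the rate at the first is five times that of the second. That difference is invisible unless arrest totals are published alongside the death counts.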

Providing contextual data is important both for public officials and the public, and especially for encouraging thoughtful conversation about potential policy improvements. In addition to this, though, Erlich described other ways that the team was working on increasing outreach and engagement. They are presently conducting user research for improved design, including through developing focus groups to explore the needs of such specific users as law enforcement, journalists and students. The team is strategizing methods for developing power users by investigating ways to ease the state legislature’s use of their data as well as through identifying potential routine academic uses for the site. The OpenJustice team continues to seek other partnerships with researchers as well, ensuring that the special utility of open data — which is that it can be used by anyone for anything — is achieving maximum public value through becoming connected to the state’s highly skilled data users.

The site’s early success is leading the OpenJustice team to plan further expansions in terms of services that the team can provide to internal stakeholders and to the public. They have just brought on a dedicated data scientist and are seeking to create more partnerships with universities and independent researchers as they look towards opening many new datasets.

Adding yet another level of partnership, the team helped connect California’s criminal justice open data initiative with the national movement to open police data, becoming the first state-level partner to join the White House’s Police Data Initiative (PDI). By joining the PDI, the California Department of Justice committed to “work with law enforcement agencies across California to adopt open data policies, and … provide tools and resources to help them better utilize their data to inform and improve policy.”

We are looking forward to seeing what happens next for the site, especially with regard to whether other states and cities seek to follow the OpenJustice model.