How an open license can encourage use of open data


Screenshot of Boston's data portal

Here in Boston, our journey to reimagine open data for our city government has led us to adopt the Public Domain Dedication and License, which we believe will help facilitate reuse of our data.

As part of our team’s wide-ranging outreach effort to democratize access to the city’s data, we hosted Summits to communicate the value of open data to the city’s newly appointed Data Coordinators. The first of these Data Coordinator Summits took place at the Boston Public Library in March 2016. It was at this first summit that we realized the importance of licensing in our efforts to broaden the value of open data in Boston.

One of our first presenters was Jake Wasserman, who shared how his mapping company, Mapkin, leverages open data to provide hyper-localized directions to users. His demo and presentation intrigued the audience, but it was his closing remarks that stuck with us. Jake encouraged the city’s data publishers to clearly designate license terms to make sure companies can easily use and build software with open data. As he explained, the worst-case scenario is to build a civic-tech product that cannot be used due to restrictive or incompatible licenses.

Setting goals

Based upon this feedback, we assessed the city’s prior licensing efforts and set forth some goals we wanted to achieve with the city’s next-generation open data platform. We wanted to set data-licensing terms that were:

  • Consistent across most or all of the datasets available.
  • Clear about how users can use the data.
  • Interoperable with other common data licenses.
  • Open to broad use with minimal or no restrictions.

We realized our existing open data licenses were inconsistent with these goals. Our datasets were published with varying license terms and sometimes lacked licensing information altogether, creating confusion for our users.

We knew that a new approach was required.

Thankfully, we were able to enlist the aid of Harvard Law School’s Cyberlaw Clinic to help us evaluate options and implement the most appropriate next steps.

Is public data copyrighted?

Before we could choose a new license, we had to settle a foundational legal question: Does the City of Boston hold copyright in its data?

This remains an open question for most governments. In Massachusetts, the Secretary of the Commonwealth instructs that “records created by Massachusetts government agencies […] are not copyrighted and are available for public use.” It is unclear whether this opinion extends to sub-state entities like the City of Boston. The Massachusetts Public Records Law further restricts the City’s proprietary rights in datasets and its ability to condition access to the datasets on any terms.

Moreover, the datasets are composed of facts, and facts are not copyrightable. While extensive research did not reveal any Massachusetts case law on the issue, a California court concluded that public-records law extinguished the copyrightability of public records. The court ruled that forcing requestors to agree to any terms was incompatible with the purpose of public-records law.

Choosing a license

It did not appear that Boston had any clear right to place restrictions on the data, so we decided to find a license that would make the data as open as possible.

Although we had originally considered placing terms that would require users to attribute any uses of the data to us, we realized that the value of data usage vastly outweighed that of attribution. As the City of San Francisco explained when they selected the Public Domain Dedication and License for their open data, “If you note us as a source, that’s awesome, but gosh, don’t mess up your [user interface] doing it.”

We identified two licenses as suitable options:

  • The Public Domain Dedication and License (PDDL).
  • Creative Commons Zero (CC0).

Both licenses are in use by open data programs across the country, are easy to interpret, and can be used in conjunction with other datasets and licenses.

We selected PDDL for two reasons:

  • PDDL is used by more of the open data portals that we surveyed.
  • PDDL is more explicitly designed for and applicable to data; by contrast, we use CC0 for the source code of and

Moving forward

We hope that the new license will foster a vibrant community of open data in Boston and enable diverse uses of the data by companies like Mapkin and many others.

If you have any feedback about our data licensing or anything else, please share it by filling out our simple feedback form.
