What’s missing from NYC’s updated Open Data Plan?


New York City recently released an updated Open Data Plan (on July 15, 2014), a follow-up to the 2013 Open Data Plan released last September and the first plan released under the new Bill de Blasio administration. Last year’s Open Data Plan was the city’s first and was delivered pursuant to New York City’s Local Law 11, which calls for an annual agency compliance plan to release all public data (as defined by the law) by 2018. If any dataset could not be released by that time, the plan was required to list the reason. While every dataset that made it into the plan had a scheduled release date no later than December 31, 2018, we questioned what information was missing from the plan and whether the scheduled release dates (many of which matched the policy’s final due date) were as ambitious as they could be. The 2014 updated plan, with removed datasets, mysterious additions, and zero explanation for overdue datasets, has left us with updated questions.

Datasets that were removed because they don’t qualify as “data”

One of the core Open Data Policy Guidelines that the Sunlight Foundation has repeatedly advocated for is a public, comprehensive list of a government’s information holdings. What New York City has provided in 2013 and in this 2014 updated Open Data Plan is an incomplete, narrowly defined list of some of the data held by New York City government agencies. If Local Law 11’s narrow definition of data did not make this clear on its face, and the 2013 Open Data Plan didn’t make it evident, the 2014 updates leave little room for debate: New York City does not have a plan to release all of its data, but rather a plan to release a specific set of information that qualifies as data under Local Law 11.

A strict definition might not seem problematic until you examine what government information it excludes. One dataset that was removed from the 2014 updated Open Data Plan was Murder in NYC reports. Given the recent analysis of the effectiveness of the NYPD’s Stop and Frisk program on crime and the NYPD homicide ruling, there is an articulable public interest in proactively and responsibly released Murder Report information (and any context it can bring). It is unclear why Murder in NYC reports failed to meet the data definition, but submitting a Freedom of Information request to the New York Police Department for this non-dataset would likely yield answers.

Below is a full list of the datasets removed from the updated plan, with explanations for why they were removed. The list has been released as an open dataset; props to New York City for releasing this information proactively in an open format on the portal:

The addition and removal of datasets from the New York City Open Data Plans serve as a prime example of why open data and transparency legislation going forward should require a full, comprehensive inventory of government information. The Smart Chicago Collaborative completed a fantastic index of criminal justice datasets, including lists of what was proactively released, what could be requested through FOIA, and what existed but could not be requested through FOIA. See more examples of government inventories and indexes on our Open Data Guidelines Examples page.

Explanations as to why datasets that did not appear in the 2013 plan were being listed in 2014

In addition to the removed datasets, some new datasets appeared in the updated plan with no explanation as to why they had been added or why they were absent from the original 2013 Open Data Plan. Were these datasets decoupled from larger datasets in the original plan? Were they missed in each agency’s first pass at listing datasets in the Open Data Plan? Or was it something else? An explanation would have been helpful and should have been included in this updated report.

Below is a list of datasets* that did not appear in the 2013 Open Data Plan, but did in the 2014 update, along with their new due dates:

Of note: the open format version of this plan was overwritten, with no public access to previous versions. Each plan should be released not only as a PDF, but also on the portal in an open format to facilitate comparative analysis.
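To illustrate the kind of comparative analysis that open-format plans make possible, here is a minimal Python sketch that compares two plan versions and reports which datasets were added or removed. It assumes each plan is available as CSV text; the column name `dataset_name` and the sample rows are hypothetical placeholders, not the actual layout or contents of New York City’s plans.

```python
import csv
import io


def dataset_names(csv_text, name_column="dataset_name"):
    """Return the set of normalized dataset names in one plan export.

    `name_column` is an assumed column name, not the real schema.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    names = set()
    for row in reader:
        name = (row.get(name_column) or "").strip().lower()
        if name:
            names.add(name)
    return names


def compare_plans(old_csv, new_csv):
    """List dataset names that appear in only one of the two plans."""
    old, new = dataset_names(old_csv), dataset_names(new_csv)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Made-up sample rows, for illustration only.
plan_2013 = (
    "agency,dataset_name,release_date\n"
    "Agency A,Example Reports,2014-03-01\n"
    "Agency B,Example Permits,2018-12-31\n"
)
plan_2014 = (
    "agency,dataset_name,release_date\n"
    "Agency B,Example Permits,2018-12-31\n"
    "Agency C,Example Inspections,2015-06-30\n"
)

changes = compare_plans(plan_2013, plan_2014)
print(changes)
```

Matching on a normalized name is a simplification: a dataset that was renamed or split between versions would show up as both a removal and an addition, which is one reason published explanations matter.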

Explanations as to why overdue datasets were delayed

Since some time has passed since the 2013 Open Data Plan, many datasets have already come due between then and the 2014 update. Unfortunately, the 2014 updated Open Data Plan did not include explanations as to why these datasets were delayed or how the new due dates (some extended by more than a year) were determined or justified. New York City should include explanations of why scheduled release dates have not been met and how extensions are determined in all of its Open Data Plan updates. Moreover, when datasets are overdue, this information should be released in an up-to-date dataset on the portal for public accountability. (While New York City lists what data has been added proactively on the data portal, it does not list what data is overdue.) Below, deduced from the plans, is a complete list of datasets* that are currently overdue according to the 2013 NYC Open Data Plan and did not come up in a quick portal search:
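For readers who want to run a similar check themselves, the following Python sketch shows one way an overdue list could be deduced: keep every planned dataset whose scheduled release date has passed and whose name does not appear among the datasets published on the portal. The field names `dataset_name` and `release_date`, the ISO date format, and the sample rows are all assumptions for illustration; this is not the method used to build the list above.

```python
from datetime import date


def overdue_datasets(plan_rows, published_names, as_of):
    """Return (name, due date) pairs that are past due and unpublished.

    plan_rows: dicts with assumed keys "dataset_name" and "release_date"
    published_names: names found on the portal (an assumed input)
    as_of: the date on which the check is made
    """
    published = {name.strip().lower() for name in published_names}
    overdue = []
    for row in plan_rows:
        due = date.fromisoformat(row["release_date"])
        name = row["dataset_name"].strip()
        if due < as_of and name.lower() not in published:
            overdue.append((name, due))
    # Oldest due date first, so the longest delays surface at the top.
    return sorted(overdue, key=lambda item: item[1])


# Made-up sample rows, for illustration only.
plan_rows = [
    {"dataset_name": "Example Reports", "release_date": "2014-03-01"},
    {"dataset_name": "Example Permits", "release_date": "2013-12-31"},
    {"dataset_name": "Example Inspections", "release_date": "2018-12-31"},
]
published_names = ["Example Permits"]

late = overdue_datasets(plan_rows, published_names, as_of=date(2014, 7, 15))
print(late)
```

An exact name match will miss datasets published under a different title, so any result from a check like this is a starting point for questions rather than a final count.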

Accountability measures New York City (and all governments) can take to improve their Open Data Plans

We can’t commend New York City enough for its commitment to releasing more open data according to a designated plan, and we are excited to see that Montgomery County has followed suit by providing an open data publishing schedule and its methodology for prioritizing releases over the next 2+ years. Still, New York City could be executing these open data plans in a more accountable and open way by:

  • Providing a comprehensive list of information held by each agency, whether or not it qualifies as “data” under the current Local Law 11 definition.
  • Providing explanations for new and overdue datasets, and a public methodology of how datasets were prioritized for release.
  • Providing schedules in open formats, prominently (i.e. linked next to the PDFs of plans), and making changes between versions easily distinguishable.
  • Amending Local Law 11 to reflect these changes in codified law, including an expanded definition of “data” that includes all information that is currently subject to freedom of information requests.

*Please report any data discrepancies/updates to: local@sunlightfoundation.com