The Daily New York

NYC's War on Duplicate Images in Public Records: What Officials, Experts, and Key Figures Are Saying

As the city's digital records backlog swells, archivists and technologists are pressing New York agencies to overhaul how they store and surface government images.

By New York News Desk · Published 4 July 2026, 4:13 pm


New York City's municipal archives are drowning in duplicate image files — and the people responsible for fixing the problem say the window for a clean solution is closing fast. Across agencies from the Department of Buildings to the Department of City Planning, records managers have spent much of 2026 flagging a growing crisis: tens of thousands of scanned photographs, permits, and survey maps exist in multiple redundant copies, clogging storage systems and slowing public-records searches to a crawl.

The timing matters. With the 2026 FIFA World Cup drawing millions of visitors to MetLife Stadium in East Rutherford, New Jersey, and to fan zones across Manhattan, city agencies have been racing to digitize and streamline permit and event-licensing records. That pressure has exposed just how messy the underlying data infrastructure really is. Officials at the Department of Information Technology and Telecommunications, commonly abbreviated DoITT, have reportedly been briefed on the scope of the duplication problem, though the agency has not issued a formal statement.

What the Experts Are Saying

Digital preservation specialists at the New York Public Library's Stephen A. Schwarzman Building on Fifth Avenue have been vocal, at least in conference settings, about the risks of letting duplicate image files compound over time. The core argument is straightforward: when multiple versions of the same document exist in a system, retrieval errors multiply, legal discoverability becomes murky, and storage costs climb without any corresponding benefit to the public. The Municipal Art Society, which has long monitored how city planning records are maintained, flagged related concerns in a 2025 report on the transparency of land-use data.

At City Hall, the current administration has not announced a dedicated program aimed specifically at image duplication. But the Mayor's Office of Technology and Innovation, which absorbed several DoITT functions under a 2022 reorganization, has indicated that a broader data-quality initiative is underway. No launch date has been made public.

Cornell Tech, the applied research campus on Roosevelt Island, has been in conversation with at least two city agencies about deploying perceptual hashing as part of a pilot deduplication effort. The technique identifies visually identical or near-identical images even when file names, formats, or compression settings differ, cases that a byte-for-byte comparison would miss. The conversations, described in a publicly available grant application filed with the National Science Foundation in March 2026, suggest the city is exploring outside partnerships rather than building solutions in-house.
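For readers curious how perceptual hashing works in principle, here is a minimal sketch of one common variant, the "average hash." It is an illustration only, not the method described in the grant application or used by any city agency. The function names, the 8x8 grid size, and the distance threshold of 5 are assumptions chosen for this example, and images are represented as plain grids of grayscale values rather than real scanned files.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixel values, 0-255 (illustrative format)


def downscale(pixels: Grid, size: int = 8) -> Grid:
    """Shrink an image to size x size by averaging blocks of pixels.

    Assumes the image is at least size pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        y0, y1 = row * height // size, (row + 1) * height // size
        out_row = []
        for col in range(size):
            x0, x1 = col * width // size, (col + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) // len(block))
        small.append(out_row)
    return small


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Build a fingerprint: one bit per cell, set when the cell is brighter than the mean."""
    flat = [value for row in downscale(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def looks_like_duplicate(a: Grid, b: Grid, max_distance: int = 5) -> bool:
    """Treat two images as likely duplicates if their fingerprints nearly match.

    max_distance is a hypothetical threshold; a real pilot would need to tune it.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Synthetic 16x16 test images: a left-to-right gradient, a slightly
# brighter copy of it, and a mirrored (right-to-left) gradient.
original = [[x * 16 for x in range(16)] for _ in range(16)]
brighter = [[value + 3 for value in row] for row in original]
mirrored = [[240 - x * 16 for x in range(16)] for _ in range(16)]

print(looks_like_duplicate(original, brighter))  # brighter copy of the same image
print(looks_like_duplicate(original, mirrored))  # visually different image
```

Because the fingerprint depends on the overall pattern of light and dark rather than on exact bytes, a rescanned or slightly brightened copy yields the same or nearly the same hash, while a different picture does not. Production systems generally rely on more robust variants and on an imaging library to decode real files.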

The Cost of Inaction

Storage is not cheap. Commercial cloud storage for large-scale image repositories runs between $0.02 and $0.05 per gigabyte per month depending on retrieval frequency, and city agencies collectively manage petabyte-scale archives. Even a modest reduction in duplicate files — analysts in similar municipal contexts have cited figures of 20 to 30 percent redundancy in unmanaged image databases — could translate to meaningful budget savings over a multi-year contract cycle.
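To put those figures in perspective, the arithmetic can be sketched as follows. The 1-petabyte archive size is a hypothetical stand-in, since no agency has published its actual holdings, and the calculation assumes a decimal petabyte (1,000,000 gigabytes) and a flat per-gigabyte price with no retrieval or transfer fees.

```python
GB_PER_PB = 1_000_000  # decimal petabyte (assumption; some providers bill in binary units)


def monthly_savings(archive_pb: float, redundancy: float, price_per_gb: float) -> float:
    """Monthly storage cost, in dollars, attributable to duplicate files."""
    return archive_pb * GB_PER_PB * redundancy * price_per_gb


# ASSUMPTION: a hypothetical 1 PB archive, not a reported city figure.
low = monthly_savings(1, 0.20, 0.02)   # 20 percent redundancy at $0.02 per GB
high = monthly_savings(1, 0.30, 0.05)  # 30 percent redundancy at $0.05 per GB

print(f"${low:,.0f} to ${high:,.0f} per month")
print(f"${low * 12:,.0f} to ${high * 12:,.0f} per year")
```

Under those assumptions, duplicates would account for roughly $4,000 to $15,000 a month for each petabyte stored, or about $48,000 to $180,000 a year.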

The Department of Records and Information Services, which operates the city's official archive at 31 Chambers Street in Lower Manhattan, declined to provide specific duplication figures when contacted Friday. A spokesperson said the agency was not in a position to comment on ongoing internal reviews.

Community groups in neighborhoods like Sunset Park and the South Bronx, where residents frequently submit Freedom of Information Law requests for building inspection photos and code-violation records, say search results regularly surface the same image multiple times. That's not a minor inconvenience — it slows down tenant advocacy, delays environmental reviews, and buries the records that matter under ones that don't.

For anyone watching this issue, the next concrete milestone is a DoITT budget hearing scheduled before the City Council's Technology Committee in September 2026. That session is expected to include testimony on data-infrastructure priorities for fiscal year 2027. Advocates say it is the most realistic near-term venue to push for a formal city policy on image deduplication — and to demand that whatever solution emerges is applied uniformly across agencies, not left to individual departments to solve piecemeal.

About this article

Published by The Daily New York

This article was produced by The Daily New York editorial desk and covers news in New York. See our editorial standards for how we use AI.
