The Daily New York


New York Is Quietly Reckoning With Duplicate Images in Its Public Record. Other Cities Are Already Years Ahead.

As digital archives expand and AI tools make copy-detection cheaper, cities from Amsterdam to Seoul have built systematic programs to scrub redundant imagery from public databases — while New York is still figuring out who owns the problem.

By New York News Desk · Published 4 July 2026, 2:36 pm


Photo: Porter, Rose, 1845-1906 / Public domain (Wikimedia Commons)

The City of New York holds tens of millions of photographs — in Department of Buildings permit files, NYPD evidence archives, Parks Department documentation, and the vast back-end systems of NYC Open Data, the public portal launched in 2012. A growing number of records managers, archivists, and open-government advocates say a quieter crisis sits inside those databases: duplicate images, sometimes thousands of near-identical frames filed under different case numbers or property records, bloating storage costs and muddying public records searches. Nobody at City Hall has publicly put a dollar figure on the problem. Nobody has formally been told to fix it.

The issue lands with particular urgency right now because New York is hosting FIFA World Cup matches at MetLife Stadium across the Hudson this summer, and city agencies have been racing to digitize and cross-reference venue documentation, security imagery, and infrastructure records at a pace they haven't attempted before. The collision of a legacy filing culture with a sudden demand for clean, searchable data has exposed seams that administrators have long papered over.

What Other Cities Have Built — and What New York Hasn't

Amsterdam's municipal archive, the Stadsarchief, completed a two-year duplicate-detection overhaul in 2024, applying perceptual hashing — a technique that generates a fingerprint for each image and flags near-matches — across more than 900,000 historical photographs. The project cut redundant storage by roughly 30 percent, according to a summary the archive published on its website. Seoul's Smart City Division, which sits inside the Seoul Metropolitan Government, has embedded similar deduplication pipelines directly into its CCTV footage management system since 2023, flagging repeated frames before they are written to long-term storage. London's Metropolitan Police announced a comparable pilot for body-worn camera footage in late 2024, though the rollout has faced scrutiny from civil liberties groups.
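For readers curious how such a fingerprint works, here is a minimal sketch of one simple perceptual-hashing variant, often called an average hash. The article's sources do not say which algorithm the Stadsarchief or Seoul actually used, so this is an illustration under stated assumptions, not a description of their systems. The function names, the tiny 4x4 sample grids, and the `max_distance` threshold are all hypothetical; production tools typically shrink each photo to a small grayscale grid (commonly 8x8) before hashing.

```python
from typing import Sequence

# Assumption: each image is already reduced to a small grayscale grid,
# given as rows of brightness values from 0 to 255.
Pixels = Sequence[Sequence[int]]


def average_hash(pixels: Pixels) -> int:
    """Build a fingerprint with one bit per pixel: 1 if brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Pixels, b: Pixels, max_distance: int = 2) -> bool:
    """Flag two images as near-matches when their fingerprints differ in few bits.

    The threshold of 2 is an illustrative choice, not a published standard.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical sample data: a checkerboard-like scene, a slightly brighter
# copy of the same scene, and a scene with a different layout.
original = [
    [200, 200, 50, 50],
    [200, 200, 50, 50],
    [50, 50, 200, 200],
    [50, 50, 200, 200],
]
brighter = [[min(255, value + 10) for value in row] for row in original]
different = [
    [200, 200, 200, 200],
    [200, 200, 200, 200],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]

print(is_near_duplicate(original, brighter))   # same scene, re-exposed
print(is_near_duplicate(original, different))  # different layout
```

Because each bit records only whether a pixel is brighter than its own image's average, a uniform change in exposure leaves the fingerprint unchanged, while a different composition flips many bits. That is what lets a system flag near-identical frames that are not byte-for-byte copies.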

New York has no equivalent centralized program. The Office of Technology and Innovation, which in 2022 absorbed the Department of Information Technology and Telecommunications (DoITT) and NYC Cyber Command, handles network infrastructure but does not operate a citywide image-deduplication service. The Municipal Archives on Chambers Street in Lower Manhattan, which holds the city's historical photographic record dating to the nineteenth century, has digitized roughly 900,000 images and made them publicly searchable, but staff there have said in presentations at the Society of American Archivists that deduplication is handled manually and inconsistently across collections.

At the borough level, the picture is patchier still. Brooklyn's Department of Buildings office processed more than 140,000 permit applications in 2024 alone, each potentially carrying multiple photo attachments. There is no published standard for how those images are deduplicated before storage, and the department did not respond to a request for clarification before deadline.

Why It Costs More Than People Think

Cloud storage is cheap until it isn't. New York City signed a contract with Microsoft Azure in 2021, the terms of which are partially available through the city's procurement database, to support expanding agency cloud needs. Storage costs scale with volume, and duplicated imagery is pure volume with no informational return. Municipal archivists at institutions including the New York Public Library on Fifth Avenue and 42nd Street — which partners with city agencies on some digitization projects — have flagged the redundancy problem in professional settings, though the library itself is not a city agency and manages its own collections separately.

The practical consequence for ordinary New Yorkers is slower, noisier search results when pulling property records in neighborhoods like Bushwick or Mott Haven, where rapid development has generated dense filing activity. A contractor pulling permit photos for a building on Knickerbocker Avenue may wade through dozens of near-identical site images filed at different inspection stages.

Cities that have tackled this, Amsterdam and Seoul chief among them, started by designating a single agency as responsible, then funded a one-time audit before automating ongoing deduplication. New York's next budget cycle, with the fiscal year 2028 process beginning in earnest this fall, will offer City Hall an opening to fund exactly that kind of audit. Without it, the databases will keep growing, the redundancy will compound, and the search problem will get worse before any election cycle forces it to the surface.


About this article

Published by The Daily New York

This article was produced by The Daily New York editorial desk and covers news in New York. See our editorial standards for how we use AI.
