The Daily New York

New York news, every day

NYC's Digital Archives Are Full of Duplicate Images. Officials and Experts Say That's a Growing Problem.

As the city accelerates its push to digitize public records, a quiet crisis of redundant, misidentified, and duplicated images is undermining data integrity across agencies from Brooklyn to the Bronx.

By New York News Desk · Published 4 July 2026, 3:57 pm

Photo: Congressional Research Service / Public domain (Wikimedia Commons)

New York City's sprawling effort to bring its paper-era archives into the digital age has produced an unexpected headache: tens of thousands of duplicate images clogging agency servers, muddying public records searches, and costing taxpayers in storage and staff time. The problem has drawn attention from technology officers inside City Hall, librarians at the Brooklyn Public Library, and archivists at the Municipal Archives on Chambers Street — all of whom say the issue demands a coordinated fix before it compounds further.

The timing matters. The city has been accelerating its digitization push since at least 2023, when the Adams administration folded several open-data initiatives under the Office of Technology and Innovation. Agencies including the Department of City Planning and the Department of Buildings have been uploading historical permit photos, survey images, and inspection records at an increasing clip. Without a unified deduplication standard, those uploads have created a pileup: the same building facade photograph filed under different permit numbers, or the same infrastructure image tagged to two separate inspection reports.

What the Experts Are Saying

Archivists at the New York City Municipal Archives, which holds more than 2.2 million photographs dating back to the 19th century, have been grappling with the duplicate-image problem for several years. The Archives, located in the Surrogate's Court building on Chambers Street in Lower Manhattan, began a systematic digitization project in 2019. Staff there have described the challenge of inherited duplicates — scanned images that arrived in batches from other agencies and were filed redundantly before any matching protocols were in place.

Technology consultants who work with municipal governments say the root cause is almost always the same: agencies digitize independently, without a shared metadata schema, and duplicates accumulate at every handoff point. The solution most often cited in archival circles is hash-based deduplication software: tools that compute a fingerprint for each image file and automatically flag byte-identical copies, with perceptual-hashing variants that also catch near-identical ones. The New York Public Library, which maintains the Schomburg Center for Research in Black Culture in Harlem as well as its flagship building on Fifth Avenue and 42nd Street, uses a version of this approach for its own digitized collections.
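For readers curious what hash-based deduplication looks like in practice, here is a minimal sketch in Python. It is an illustration only, not the software used by any agency or library named in this story. The function names `file_fingerprint` and `find_exact_duplicates` are hypothetical, and the sketch catches only byte-for-byte identical files. Near-identical copies, such as a rescanned or resized photo, would need a perceptual hash, which is not shown here.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large scans are never loaded into memory whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical."""
    groups: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[file_fingerprint(path)].append(path)
    # Keep only fingerprints shared by more than one file.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Run against a folder of scans, `find_exact_duplicates(Path("scans"))` would return one list per set of identical files (the folder name is a placeholder). The same facade photo filed under two permit numbers would land in the same group, provided both copies are the same file byte for byte.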

The City University of New York's Graduate School of Library and Information Studies, based at Queens College in Flushing, has produced coursework specifically on digital asset management for municipal collections. Faculty there have pointed out that duplicate images are not merely a storage nuisance — they introduce errors into search results, skew statistical analyses of public records, and can compromise legal proceedings that depend on photographic evidence tied to specific permit or inspection numbers.

What City Hall Is — and Isn't — Doing

The Office of Technology and Innovation, which consolidated several digital-government functions under the Adams administration, has not publicly released a citywide deduplication policy as of July 4, 2026. The Department of City Planning updated its data governance framework in late 2024, but that update focused primarily on geographic information system layers rather than photographic assets. The Department of Buildings, which maintains one of the largest image repositories in city government because of its inspection and violation records, last month declined multiple requests for comment on its deduplication practices.

Storage costs are not trivial. Commercial cloud storage for municipal governments typically runs between $0.02 and $0.05 per gigabyte per month, and city agencies collectively maintain petabyte-scale archives. Redundant image files can inflate that footprint by an estimated 20 to 35 percent, according to figures published by the Digital Preservation Coalition, a UK-based nonprofit whose membership includes several major American institutions.
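To put those figures in rough dollar terms, the arithmetic can be sketched in a few lines of Python. The sketch rests on assumptions not in the reporting: a baseline of one decimal petabyte (1,000,000 gigabytes) of unique data, and a reading of the 20 to 35 percent figure as extra storage on top of that baseline. The function name is hypothetical.

```python
GB_PER_PB = 1_000_000  # decimal petabyte; an assumption for illustration


def redundant_storage_cost(unique_gb: float,
                           redundancy_share: float,
                           price_per_gb_month: float) -> float:
    """Monthly cost of duplicate copies.

    `redundancy_share` is the fraction by which duplicates inflate the
    unique footprint (0.20 means 20 percent extra storage).
    """
    return unique_gb * redundancy_share * price_per_gb_month


# Low end: 20 percent redundancy at $0.02 per gigabyte per month.
low = redundant_storage_cost(GB_PER_PB, 0.20, 0.02)
# High end: 35 percent redundancy at $0.05 per gigabyte per month.
high = redundant_storage_cost(GB_PER_PB, 0.35, 0.05)

print(f"${low:,.0f} to ${high:,.0f} per month per petabyte of unique data")
```

Under those assumptions, duplicates would add roughly $4,000 to $17,500 a month for every petabyte of unique data, or about $48,000 to $210,000 a year, before counting staff time.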

Advocates at the Reinvent Albany government reform group have been pushing the city to publish a comprehensive digital asset inventory since 2025, arguing that taxpayers deserve transparency about what is being stored, at what cost, and how accurately. That request remains unanswered.

For New Yorkers who rely on public-records searches — journalists, lawyers, historians, community boards reviewing development proposals — the practical advice from archivists is to cross-reference any image retrieved from a city database against at least one secondary source, particularly for records generated before 2020. The Municipal Archives reading room on Chambers Street is open to the public by appointment and can assist with verification. City officials say a broader data governance update is expected by the end of fiscal year 2027 — though no formal timeline has been published.

About this article

Published by The Daily New York

This article was produced by The Daily New York editorial desk and covers news in New York. See our editorial standards for how we use AI.
