The Daily New York

NYC Public Records Digitization Hits Milestone: 40K Duplicates Removed

New York City's municipal archive purge removes 40,000+ duplicate files, speeding up courthouse and agency record searches for researchers and journalists.

By New York News Desk · Published 4 July 2026, 2:39 pm

Photo: Optical Chemist on Pexels

New York City's Department of Records and Information Services announced this week that it has cleared more than 40,000 duplicate photographic files from the municipal archive's digital catalog — the largest single purge since the agency began its systematic digitization effort in 2021. The cleanup, centered on collections held at the DORIS facility on Chambers Street in Lower Manhattan, removes redundant scans that had been clogging search results for researchers, journalists, and legal teams pulling records through the city's online portal.

The timing matters. With the 2026 FIFA World Cup drawing tens of thousands of visitors to MetLife Stadium and to public venues across the five boroughs, city agencies have been under pressure to modernize public-facing record systems quickly. Duplicate image files — created when batches of photographs were scanned multiple times during overlapping digitization contracts — have frustrated requests ranging from title searches in Brooklyn Housing Court to community board submissions in the Bronx. The backlog had become a recurring complaint at City Hall technology briefings throughout the spring.

What the Problem Actually Looked Like

The duplicate-image issue grew from a contracting gap that opened between 2018 and 2022, when three separate vendors handled different phases of the municipal photo archive migration. Each vendor used its own naming convention, so the same physical photograph (say, a 1960s construction image from the Parks Department's Flushing Meadows–Corona Park collection) might appear four or five times under different file identifiers. Staff at the Municipal Archives on Chambers Street had flagged the problem internally, but a full deduplication pass required computational matching tools that the agency acquired only through a technology grant finalized in March 2025.

The New York Public Library's Milstein Division on 42nd Street, which holds a parallel collection of city photographs donated through separate channels, encountered similar issues when it attempted to cross-reference its holdings with DORIS records earlier this year. Librarians there had to manually flag hundreds of matches before the automated tool was available. That kind of repeated effort, by institutions and by individual staff, is what the Chambers Street cleanup is designed to prevent.

According to budget documents filed with the City Council in April 2026, the deduplication project was allocated $1.2 million from the city's Fiscal Year 2026 technology modernization fund. The scope covers approximately 2.3 million digitized images across 14 agency collections, with the DORIS photo holdings representing the first completed phase. The remaining collections, which include records from the Department of Buildings and the Landmarks Preservation Commission, are scheduled for completion by December 31, 2026.

What Researchers and the Public Can Expect Next

Starting July 7, users of the NYC Municipal Archives online portal will see cleaned search returns for the photograph collections already processed. A search for images tied to, say, the old Domino Sugar Refinery in Williamsburg — a popular subject among urban historians and architects — had previously returned duplicate clusters that required manual sorting. Those searches should now surface distinct, individually catalogued results.

The Buildings Department collection, which legal firms along Fulton Street in Downtown Brooklyn use heavily for property dispute documentation, is next in the processing queue. That phase is expected to begin in August. The Landmarks Preservation Commission records, covering more than 38,000 designated properties citywide, follow in October.

For anyone who relies on the city's digital archive — from graduate students at the CUNY Graduate Center on 34th Street to attorneys filing in State Supreme Court at 60 Centre Street — the practical advice right now is straightforward: re-run any searches you conducted before July 1 if you were getting cluttered or redundant results. The cleaned catalog should return more precise matches. DORIS also asks that users who find remaining duplicates submit a correction flag through the portal's feedback form, since automated matching tools still carry an estimated 3 to 5 percent error rate on low-resolution historical scans.

The full deduplication of all 14 collections, if completed on schedule, would represent the most comprehensive cleanup of New York's public image records since the analog-to-digital migration began over a decade ago.

About this article

Published by The Daily New York

This article was produced by The Daily New York editorial desk and covers news in New York. See our editorial standards for how we use AI.
