The Daily New York

New York news, every day

How New York's Digital Archives Became a Maze of Duplicate Images — and Why Fixing It Now Matters

A years-long accumulation of redundant photographs across city agency databases has created a costly storage and retrieval crisis that officials are only beginning to address.

By New York News Desk · Published 4 July 2026, 2:36 pm

Photo: Geijsbeek, John B. (John Bart), 1872- Pacioli, Luca, d. ca. 1445-1517 Manzoni, Domenico Pietra, Angelo, 16th century Mainardi, Matteo Christoffels, Jan Ympyn, 16th cent Stevin, Simon, 1548-1620 Dafforne, Richard / Public domain (Wikimedia Commons)

New York City's sprawling network of public-facing digital platforms — spanning more than 40 mayoral agencies, the MTA, the Department of City Planning, and dozens of community portals — is carrying a hidden weight: hundreds of thousands of duplicate images stored across overlapping servers, inflating costs and slowing the retrieval systems that residents and journalists rely on daily. The problem did not appear overnight. It has been building since the early 2010s, when city agencies began digitizing records in parallel, without a unified archiving standard.

The timing of any reckoning matters. New York is in the middle of a $100 billion-plus MTA capital program and is simultaneously gearing up to serve an estimated 5 million visitors for the 2026 FIFA World Cup, with MetLife Stadium in East Rutherford anchoring matches and much of the fan infrastructure routed through Midtown Manhattan. Every city tourism page, transit map, and event-promotion asset published online draws from the same fragmented image libraries that have never been properly deduplicated.

A Problem Built Layer by Layer

The roots trace back to 2011 and 2012, when then-Mayor Bloomberg's administration pushed agencies onto individual content management systems without mandating shared asset libraries. The Department of Transportation, NYC Parks, and the Department of Buildings each stood up their own digital repositories. When de Blasio's administration later built NYC.gov's unified front end, it pulled assets from all of those siloed systems simultaneously — copying rather than linking. Photographers hired for city events would submit images to multiple departments, each of which would upload the same file independently. By the mid-2010s, estimates from city technology staff placed the duplication rate in some archives at above 30 percent, according to internal reviews cited in subsequent budget documents.

The Adams administration inherited this situation in January 2022. The Office of Technology and Innovation, based at 255 Greenwich Street in Lower Manhattan, identified deduplication as a line item in its Fiscal Year 2024 budget proposal, but allocations were trimmed during subsequent rounds of agency cuts. The result: partial cleanup runs were completed on the NYC Open Data portal, which hosts more than 2,900 public datasets including photo archives, but the deeper backend repositories feeding agency websites remained largely untouched through early 2026.

Storage costs are a concrete part of the picture. Cloud infrastructure for city government — managed in part through contracts with vendors operating out of data centers in northern New Jersey — has grown substantially year over year. The city's overall IT expenditure crossed $1.6 billion in Fiscal Year 2025, according to the Mayor's Office of Management and Budget's adopted budget documents. Technology advocates at the nonprofit Reinvent Albany have pointed to redundant data storage as one category where savings could offset other technology investments, though the organization has not published a specific figure for image duplication costs alone.

What Cleanup Actually Looks Like

Deduplication at this scale is not simply a matter of running a script. It requires reconciling metadata standards across agencies, establishing which version of a duplicated image is the authoritative one, and updating thousands of hardcoded links embedded in old web pages. The city's 311 portal, the NYC Planning digital map tools used by developers filing applications in Brooklyn and the Bronx, and the tourism-facing NYC & Company website at 810 Seventh Avenue all draw on image assets that would need to be repointed during any migration.
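The two mechanical steps described above, picking an authoritative copy and repointing old links, can be sketched roughly as follows. This is a minimal illustration, not the city's actual tooling: the `Asset` record, the example URLs, and the rule that the earliest upload wins are all assumptions made for the sketch, and the content hash is assumed to have been computed already.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Asset:
    """Hypothetical record for one stored copy of an image."""
    url: str
    content_hash: str  # assumed to be precomputed from the file bytes
    uploaded: date


def build_redirect_map(assets):
    """Map every duplicate URL to the URL of its authoritative copy.

    Assumption: the earliest upload is treated as authoritative,
    with the URL as a tie-breaker so the result is deterministic.
    """
    groups = {}
    for asset in assets:
        groups.setdefault(asset.content_hash, []).append(asset)

    redirect = {}
    for copies in groups.values():
        canonical = min(copies, key=lambda a: (a.uploaded, a.url))
        for asset in copies:
            if asset.url != canonical.url:
                redirect[asset.url] = canonical.url
    return redirect


def repoint_links(html, redirect):
    """Rewrite hardcoded duplicate URLs in a page to the canonical URL."""
    if not redirect:
        return html
    # Longest URLs first, so a URL that is a prefix of another cannot win.
    urls = sorted(redirect, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(u) for u in urls))
    return pattern.sub(lambda m: redirect[m.group(0)], html)


assets = [
    Asset("https://dot.example/img/bridge.jpg", "abc", date(2012, 3, 1)),
    Asset("https://parks.example/media/bridge_copy.jpg", "abc", date(2014, 6, 9)),
    Asset("https://dob.example/p/1.jpg", "def", date(2013, 1, 1)),
]
redirect = build_redirect_map(assets)
page = '<img src="https://parks.example/media/bridge_copy.jpg">'
print(repoint_links(page, redirect))
```

In practice the choice of authoritative copy would probably depend on agency metadata rather than upload date alone, which is the reconciliation work the paragraph above describes.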

The World Cup deadline is functioning as an unofficial forcing mechanism. NYC & Company and the Mayor's Office of Media and Entertainment, which coordinates official event imagery, are expected to publish updated visual assets for Cup-related programming through the spring of 2026, and those pipelines depend on clean, accessible archives. Consultants working on the city's digital infrastructure — firms holding contracts that run through the end of calendar year 2026 — are understood to be scoping a phased deduplication project, beginning with the highest-traffic public portals.

For now, the practical advice for journalists, researchers, or developers working with NYC's open image repositories is straightforward: cross-check any downloaded asset against the NYC Open Data portal's most recently updated dataset version, note the upload date, and assume that identical images may carry different file identifiers across different agency pages. The city has published a data dictionary for its Open Data assets at opendata.cityofnewyork.us, which remains the most reliable single entry point while the broader cleanup is underway.
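Because identical images may carry different file identifiers, one way to cross-check downloaded assets is to compare them by content rather than by name. The sketch below assumes the files are already in memory as bytes and uses made-up identifiers; it only catches byte-for-byte copies, so a resized or re-encoded version of the same photograph would not match.

```python
import hashlib
from collections import defaultdict


def find_duplicates(files):
    """Group identifiers whose file contents are byte-for-byte identical.

    `files` maps an identifier (hypothetical here) to the raw file bytes.
    Returns a sorted list of groups, each a sorted list of identifiers.
    """
    by_digest = defaultdict(list)
    for identifier, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        by_digest[digest].append(identifier)
    # Keep only digests shared by more than one identifier.
    return sorted(sorted(ids) for ids in by_digest.values() if len(ids) > 1)


files = {
    "dot-0012": b"\x89PNG-bridge",
    "parks-9981": b"\x89PNG-bridge",
    "dob-0007": b"other-image",
}
print(find_duplicates(files))
```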

About this article

Published by The Daily New York

This article was produced by The Daily New York editorial desk and covers news in New York. See our editorial standards for how we use AI.
