AI Inference Pulls Infrastructure Back into Metro Data Centers

New York City is not where the AI infrastructure boom was supposed to happen.

Over the last two years, the industry’s center of gravity has tilted toward sprawling hyperscale campuses in Northern Virginia, Texas, Utah, and Louisiana – places where operators chase vast tracts of land and power for giant GPU clusters.

A growing share of AI workloads, however, no longer revolves solely around training. It revolves around inference: running models in production for real users, applications, APIs, and automated systems.

“Once a model is in production and revenue-bearing, the bottleneck stops being raw FLOPS and starts being round-trip time, jitter, and egress cost,” said Stephen Sopko, an analyst-in-residence covering semiconductors and deep tech at HyperFrame Research.

That dynamic is beginning to pull portions of AI infrastructure back toward carrier-dense metro facilities sited close to users and network interconnection rather than remote campuses built primarily around power scale.

A recent deployment by the Brooklyn-based colocation provider DataVerge and AI software company Mathpix offers an early example of that transition.

Mathpix said it is deploying Nvidia B300 GPU systems at DataVerge’s facility in Brooklyn to support AI model training and real-time inference workloads for its document processing platform.

From AWS to Owned Infrastructure

Mathpix converts PDFs, handwritten notes, equations, and scanned documents into structured, machine-readable text for enterprise workflows and AI applications. Demand surged as AI developers sought to transform vast PDF archives into training-ready datasets, according to Mathpix founder and CEO Nico Jimenez.

“We started getting a lot of interest from hyperscalers who wanted mass-scale conversion of PDFs,” Jimenez told Data Center Knowledge.

The company’s infrastructure evolved through trial and error. Jimenez said Mathpix initially experimented with desktop GPU systems inside its Williamsburg office before moving into enterprise AI hardware. One early A100 server proved difficult to power, impractical to cool, and too noisy to operate in an office environment.

“We finally turn on the A100 server, and it sounds like a jet engine,” Jimenez said. “Very quickly, we realized this is not going to work.”

Mathpix moved into DataVerge’s facility at Industry City. At first, the deployment focused on training. Over time, Mathpix shifted more of its stack – including inference systems, databases, and deployment tooling – onto owned hardware, citing better performance and economics than comparable cloud infrastructure.

“Doing things on the local network just goes way faster,” Jimenez said.

The company still operates a hybrid environment spanning AWS and the Brooklyn deployment, using cloud capacity for burst scaling while shifting persistent workloads and latency-sensitive services onto its colocated hardware.

Inside the Brooklyn Deployment

DataVerge CEO Ray Sidler said Mathpix initially deployed only a few cabinets while still relying heavily on AWS, then expanded after seeing faster response times and lower costs running workloads locally.

“They saw that they were getting a faster response time, and the data was processing faster in our facility because of the ecosystem and the carrier ecosystem,” Sidler said.

DataVerge’s Industry City site currently supports air-cooled GPU deployments of up to 35 kW per cabinet, using cold-aisle containment pods and upgraded flooring systems designed for denser AI infrastructure. According to the company, the facility has about 1 MW of remaining capacity, and a further 3 MW expansion is planned for 2027, including higher-density AI infrastructure designed specifically for GPU customers.
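
For a rough sense of scale, the density and capacity figures quoted for the Industry City site can be turned into cabinet counts. The sketch below is illustrative arithmetic only: it assumes the quoted capacity is usable IT load and that every cabinet draws the full 35 kW ceiling, neither of which the company has stated. The function name `full_density_cabinets` is hypothetical.

```python
# Illustrative arithmetic only. Assumptions (not from DataVerge):
#   - quoted capacity is treated as usable IT load
#   - every cabinet draws the full air-cooled ceiling
MAX_KW_PER_CABINET = 35        # "up to 35 kW per cabinet"
REMAINING_CAPACITY_KW = 1_000  # "about 1 MW" remaining
PLANNED_EXPANSION_KW = 3_000   # planned 3 MW expansion


def full_density_cabinets(capacity_kw: float,
                          kw_per_cabinet: float = MAX_KW_PER_CABINET) -> int:
    """Return how many whole cabinets fit if each draws kw_per_cabinet."""
    if kw_per_cabinet <= 0:
        raise ValueError("kw_per_cabinet must be positive")
    return int(capacity_kw // kw_per_cabinet)


if __name__ == "__main__":
    print("Remaining capacity:", full_density_cabinets(REMAINING_CAPACITY_KW), "cabinets")
    print("Planned expansion: ", full_density_cabinets(PLANNED_EXPANSION_KW), "cabinets")
```

Under those assumptions, the remaining megawatt covers fewer than 30 fully loaded GPU cabinets, which underlines how small a metro deployment is next to a hyperscale campus.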

“We started to build 10-cabinet cold aisle containment pods,” Sidler said. “We’re able to put roughly 500 kW of power into those pods.”

Why Urban Colocation Fits Inference

Over the past two years, the AI land rush prioritized utility-scale power development and warehouse-scale training clusters. Inference workloads often operate under different constraints, prioritizing network proximity, interconnection density, operational flexibility, and predictable latency. That dynamic could create new demand for carrier-dense urban colocation facilities historically associated with enterprise IT, financial trading infrastructure, and interconnection rather than GPU-heavy AI deployments.

Sidler argues the industry has become overly focused on giant hyperscale campuses while underestimating the role smaller metro deployments could play as inference demand grows.

“The local markets and the 5- to 10-megawatt data centers will be the sweet spot,” Sidler said.

Training and Inference Begin to Split Infrastructure

The deployment highlights how the AI infrastructure market may be fragmenting into multiple models.

Sopko said the market increasingly resembles “two distinct infrastructure categories” emerging around training and inference.

“Training looks like centralized AI factories optimized for raw density, power availability, and east-west bandwidth inside the cluster,” Sopko said. “Inference is looking more like a distributed fabric of smaller, network-dense pods sized to serve regional user populations.”

Large frontier-model training still favors enormous, centralized campuses. But some AI workloads – particularly inference systems tied to enterprise applications, APIs, and real-time services – increasingly reward proximity to users and dense network interconnection.
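
The proximity argument has a physical floor: light in optical fiber travels at roughly two-thirds of its vacuum speed, so every kilometer of route adds delay that no hardware upgrade removes. The sketch below estimates that floor for two hypothetical fiber routes. The distances and the refractive index are illustrative assumptions, not figures from DataVerge or Mathpix, and real round-trip times are higher once routing, queuing, and processing are included.

```python
# Lower-bound round-trip propagation delay over optical fiber.
# Assumptions (illustrative, not from the article):
#   - refractive index of about 1.47, typical for silica fiber
#   - route lengths below are hypothetical examples
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_REFRACTIVE_INDEX = 1.47


def fiber_rtt_ms(route_km: float) -> float:
    """Return the round-trip propagation delay in milliseconds for a fiber route."""
    if route_km < 0:
        raise ValueError("route_km must be non-negative")
    speed_in_fiber = SPEED_OF_LIGHT_KM_PER_S / FIBER_REFRACTIVE_INDEX
    one_way_seconds = route_km / speed_in_fiber
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    # Hypothetical routes: a user near a metro facility vs. a distant campus.
    for label, km in [("metro facility", 15), ("remote campus", 500)]:
        print(f"{label:>15}: {km:>4} km of fiber, at least {fiber_rtt_ms(km):.2f} ms round trip")
```

As a rule of thumb, this works out to about 1 ms of round-trip delay per 100 km of fiber, before any network or compute overhead is added.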

Jimenez said Mathpix eventually moved databases, logging systems, and deployment infrastructure to colocated hardware after discovering major performance and cost advantages over cloud services.

Sidler said similar economics now drive broader customer demand.

“We were losing customers to Amazon,” Sidler said. “Now it became a hybrid model.”

For now, deployments like Mathpix’s in Brooklyn are tiny next to the multi-gigawatt AI campuses. Even Mathpix notes its newest B300 systems currently function primarily as training infrastructure rather than large-scale inference clusters.

Still, the deployment reflects how some AI companies that matured inside AWS are beginning to rebuild portions of their infrastructure stacks around owned metro GPU infrastructure as cloud economics, latency, and operational control become more important inside AI production environments.

 
