{"id":1854,"date":"2026-05-22T11:55:19","date_gmt":"2026-05-22T11:55:19","guid":{"rendered":"https:\/\/trustedainews.com\/?p=1854"},"modified":"2026-05-22T11:55:19","modified_gmt":"2026-05-22T11:55:19","slug":"ai-inference-pulls-infrastructure-back-into-metro-data-centers","status":"publish","type":"post","link":"https:\/\/trustedainews.com\/?p=1854","title":{"rendered":"AI Inference Pulls Infrastructure Back into Metro Data Centers"},"content":{"rendered":"<p>AI Inference Pulls Infrastructure Back Into Metro Data Centers. New York City is not where the AI infrastructure boom was supposed to happen. Over the last two years, the industry\u2019s center of gravity has tilted toward sprawling hyperscale campuses in Northern Virginia, Texas, Utah, and Louisiana \u2013 places where operators chase vast tracts of land and power for giant GPU clusters. A growing share of AI workloads, however, no longer revolves solely around training. It revolves around inference: running models in production for real users, applications, APIs, and automated systems. \u201cOnce a model is in production and revenue-bearing, the bottleneck stops being raw FLOPS and starts being round-trip time, jitter, and egress cost,\u201d said Stephen Sopko, an analyst-in-residence covering semiconductors and deep tech at HyperFrame Research. That dynamic is beginning to pull portions of AI infrastructure back toward carrier-dense metro facilities sited close to users and network interconnection rather than remote campuses built primarily around power scale. A recent deployment by the Brooklyn-based colocation provider DataVerge and AI software company Mathpix offers an early example of that transition. Mathpix said it is deploying Nvidia B300 GPU systems at DataVerge\u2019s facility in Brooklyn to support AI model training and real-time inference workloads for its document processing platform. 
From AWS to Owned Infrastructure. Mathpix converts PDFs, handwritten notes, equations, and scanned documents into structured, machine-readable text for enterprise workflows and AI applications. Demand surged as AI developers sought to transform vast PDF archives into training-ready datasets, according to Mathpix founder and CEO Nico Jimenez. \u201cWe started getting a lot of interest from hyperscalers who wanted mass-scale conversion of PDFs,\u201d Jimenez told Data Center Knowledge. The company\u2019s infrastructure evolved through trial and error. Jimenez said Mathpix initially experimented with desktop GPU systems inside its Williamsburg office before moving into enterprise AI hardware. One early A100 server proved difficult to power, impractical to cool, and too noisy to operate in an office environment. \u201cWe finally turn on the A100 server, and it sounds like a jet engine,\u201d Jimenez said. \u201cVery quickly, we realized this is not going to work.\u201d Mathpix moved into DataVerge\u2019s facility at Industry City. At first, the deployment focused on training. Over time, Mathpix shifted more of its stack \u2013 including inference systems, databases, and deployment tooling \u2013 onto owned hardware, citing better performance and economics than comparable cloud infrastructure. \u201cDoing things on the local network just goes way faster,\u201d Jimenez said. The company still operates a hybrid environment spanning AWS and the Brooklyn deployment, using cloud capacity for burst scaling while shifting persistent workloads and latency-sensitive services on-premises. Inside the Brooklyn Deployment. DataVerge CEO Ray Sidler said Mathpix initially deployed only a few cabinets while still relying heavily on AWS, but then expanded after seeing faster response times and lower costs running workloads locally. 
\u201cThey saw that they were getting a faster response time, and the data was processing faster in our facility because of the ecosystem and the carrier ecosystem,\u201d Sidler said. DataVerge&#8217;s Industry City site currently supports air-cooled GPU deployments up to 35 kW per cabinet using cold-aisle containment pods and upgraded flooring systems designed for denser AI infrastructure. According to the company, the facility has about 1 MW of remaining capacity and is planning a further 3 MW expansion for 2027, including higher-density AI infrastructure designed specifically for GPU customers. \u201cWe started to build 10-cabinet cold aisle containment pods,\u201d Sidler said. \u201cWe\u2019re able to put roughly 500 kW of power into those pods.\u201d Why Urban Colocation Fits Inference. Over the past two years, the AI land rush prioritized utility-scale power development and warehouse-scale training clusters. Inference workloads often operate under different constraints, prioritizing network proximity, interconnection density, operational flexibility, and predictable latency. That dynamic could create new demand for carrier-dense urban colocation facilities historically associated with enterprise IT, financial trading infrastructure, and interconnection rather than GPU-heavy AI deployments. Sidler argues the industry has become overly focused on giant hyperscale campuses while underestimating the role smaller metro deployments could play as inference demand grows. \u201cThe local markets and the 5- to 10-megawatt data centers will be the sweet spot,\u201d Sidler said. Training and Inference Begin to Split Infrastructure. The deployment highlights how the AI infrastructure market may be fragmenting into multiple models. Sopko said the market increasingly resembles \u201ctwo distinct infrastructure categories\u201d emerging around training and inference. 
\u201cTraining looks like centralized AI factories optimized for raw density, power availability, and east-west bandwidth inside the cluster,\u201d Sopko said. \u201cInference is looking more like a distributed fabric of smaller, network-dense pods sized to serve regional user populations.\u201d Large frontier-model training still favors enormous, centralized campuses. But some AI workloads \u2013 particularly inference systems tied to enterprise applications, APIs, and real-time services \u2013 increasingly reward proximity to users and dense network interconnection. Jimenez said Mathpix eventually moved databases, logging systems, and deployment infrastructure to colocated hardware after discovering major performance and cost advantages over cloud services. Sidler said similar economics now drive broader customer demand. \u201cWe were losing customers to Amazon,\u201d Sidler said. \u201cNow it became a hybrid model.\u201d For now, deployments like Mathpix&#8217;s in Brooklyn are tiny next to the multi-gigawatt AI campuses. Even Mathpix notes its newest B300 systems currently function primarily as training infrastructure rather than large-scale inference clusters. Still, the deployment reflects how some AI companies that matured inside AWS are beginning to rebuild portions of their infrastructure stacks around owned metro GPU infrastructure as cloud economics, latency, and operational control become more important inside AI production environments.<\/p>\n<p>\u00a0<\/p>","protected":false},"excerpt":{"rendered":"<p>AI Inference Pulls Infrastructure Back Into Metro Data Centers. 4 Min Read. DataVerge. 
New York City is not where the&hellip;<\/p>\n","protected":false},"author":2,"featured_media":401,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[],"class_list":["post-1854","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-center"],"_links":{"self":[{"href":"https:\/\/trustedainews.com\/index.php?rest_route=\/wp\/v2\/posts\/1854","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/trustedainews.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/trustedainews.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/trustedainews.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/trustedainews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1854"}],"version-history":[{"count":0,"href":"https:\/\/trustedainews.com\/index.php?rest_route=\/wp\/v2\/posts\/1854\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/trustedainews.com\/index.php?rest_route=\/wp\/v2\/media\/401"}],"wp:attachment":[{"href":"https:\/\/trustedainews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1854"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/trustedainews.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1854"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/trustedainews.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1854"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}