For the past few years, conversations about AI infrastructure have focused primarily on training clusters. The emphasis was on large models, large GPU clusters, tightly coupled scale-out networks, and the immense synchronization demands of collective communication among many accelerators.
Fiber layouts mirrored these priorities: long optical spans across buildings and campuses, high-capacity north-south traffic, and very dense interconnects. That model still matters. Nevertheless, deployment trends in 2026 are pointing in a different direction.
A key development in AI over the past year is that inference has surpassed training as the main operational workload. In essence, more compute is now spent running models than building them. This marks the evolution of AI from a primarily research-focused domain into an operational one.
The shift to inference changes how the infrastructure behaves. Over the past few months, I've noticed that the infrastructure discussion has not fully caught up with what is actually happening inside AI systems. Much of the public conversation still centers on accelerator counts, power consumption, and hyperscale training clusters.
Much less attention has been paid to the practical implications for optical infrastructure, pathway allocation, topology planning, and physical network architecture. That is why my colleagues and I started working on a white paper series on the connection between AI workload behavior and physical infrastructure design.
