AI Energy Gap and Chiplets: Why Data Movement Matters

At the recent Chiplet Summit 2026 preconference tutorial, the panel session “Best Way to Make Chiplets Work” brought together leaders from across the semiconductor ecosystem to tackle one of the most pressing challenges in advanced system design: how to make heterogeneous, multi-die systems operate as a cohesive, energy-efficient whole for AI.

While much discussion focused on standards such as UCIe and evolving interconnect specifications, a consistent theme emerged: connectivity at the physical layer is necessary, but insufficient. Making AI chiplets truly work efficiently demands clarity at the architectural and semantic layers, where data movement, coherency, and system behavior are defined.

From an Arteris perspective, this distinction is critical.

Beyond bits: Language matters

One of the central insights discussed was that UCIe provides a means to move bits from one die to another, but it does not define the meaning of those bits. As Ashley Stevens of Arteris emphasized, the physical link is only part of the solution. The real challenge lies in defining how chiplets communicate: what protocol is appropriate, what semantics are shared, and how capabilities are negotiated.

In some systems, particularly scale-up environments such as multi-core processing, a coherent interconnect spanning chiplets is required. In other scenarios, such as AI data-movement workloads, coherent protocols impose unnecessary energy overhead even when used for non-coherent communication, with packetization and protocol overhead consuming up to 50% of overall bandwidth. In such cases, die-to-die remote direct memory access (RDMA) techniques or long-burst transfers can be dramatically more efficient: they maximize the data-to-overhead ratio and therefore minimize the energy per useful data bit moved.
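
The data-to-overhead argument can be made concrete with a small back-of-the-envelope calculation. The Python sketch below is illustrative only: the link energy (0.5 pJ per bit), the fixed 64-byte per-transfer overhead, and the two payload sizes are assumptions chosen for the example, not figures from the panel, from UCIe, or from any particular protocol. It also assumes that every bit crossing the link costs the same energy, whether it is payload or overhead.

```python
# Illustrative sketch only: all numeric values below are assumptions,
# not figures from the panel, UCIe, or any specific protocol.

def payload_efficiency(payload_bytes: int, overhead_bytes: int) -> float:
    """Fraction of the bytes on the link that are useful payload."""
    return payload_bytes / (payload_bytes + overhead_bytes)


def energy_per_useful_bit(link_pj_per_bit: float,
                          payload_bytes: int,
                          overhead_bytes: int) -> float:
    """Energy per payload bit, assuming every bit on the link costs the same."""
    return link_pj_per_bit / payload_efficiency(payload_bytes, overhead_bytes)


LINK_PJ_PER_BIT = 0.5  # hypothetical die-to-die link energy
OVERHEAD_BYTES = 64    # hypothetical header/protocol cost per transfer

for name, payload in [("cache-line-sized transfer", 64), ("long burst", 4096)]:
    eff = payload_efficiency(payload, OVERHEAD_BYTES)
    energy = energy_per_useful_bit(LINK_PJ_PER_BIT, payload, OVERHEAD_BYTES)
    print(f"{name}: {eff:.1%} payload, {energy:.3f} pJ per useful bit")
```

With these assumed numbers, the 64-byte transfer spends half of the link on overhead (the 50% case mentioned above) and costs 1.0 pJ per useful bit, while the 4096-byte burst is about 98% payload and costs roughly 0.51 pJ per useful bit. Real links add further effects, such as flow control, retries, and idle power, that this sketch deliberately ignores.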

To read the full article on Semiconductor Engineering, click here.