AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving By Zhongkai Yu, University of California April 30, 2026
Exploring the Efficiency of 3D-Stacked AI Chip Architecture for LLM Inference with Voxel By Yiqi Liu, University of Illinois Urbana-Champaign April 30, 2026
Epoxy Composites Reinforced with Long Al₂O₃ Nanowires for Enhanced Thermal Management in Advanced Semiconductor Packaging By Zihao Lin, Georgia Institute of Technology April 29, 2026
Chipmunq: A Fault-Tolerant Compiler for Chiplet Quantum Architectures By Peter Wegmann, Technical University of Munich April 29, 2026
Cross Waveguide Design for Color-Centers in Diamond for Photonic Quantum Computing By Alessio Miranda, Delft University of Technology April 27, 2026
CHICO-Agent: An LLM Agent for the Cross-layer Optimization of 2.5D and 3D Chiplet-based Systems By Qihang Wu, Arizona State University April 22, 2026
A PPA-Driven 3D-IC Partitioning Selection Framework with Surrogate Models By Shang Wang, University of Alberta April 22, 2026
Fleet: Hierarchical Task-based Abstraction for Megakernels on Multi-Die GPUs By Sangeeta Chowdhary, AMD Research April 21, 2026
ChipLight: Cross-Layer Optimization of Chiplet Design with Optical Interconnects for LLM Training By Kangbo Bai, Peking University April 20, 2026
ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving By Yuseon Choi, KAIST April 17, 2026
Technology solutions targeting the performance of gen-AI inference in resource constrained platforms By Joyjit Kundu, IMEC April 15, 2026
Rethinking Compute Substrates for 3D-Stacked Near-Memory LLM Decoding: Microarchitecture–Scheduling Co-Design By Chenyang Ai, University of Edinburgh April 8, 2026
DeepStack: Scalable and Accurate Design Space Exploration for Distributed 3D-Stacked AI Accelerators By Zhiwen Mo, Imperial College London April 7, 2026
Mapping Space Exploration for Multi-Chiplet Accelerators Targeting LLM Inference Serving Workloads By Boyu Li, University of Science and Technology of China April 3, 2026
3D optoelectronics and co-packaged optics: when solving the wrong problems stalls deployment By Yasha Yi, University of Wisconsin April 1, 2026
Expert Streaming: Accelerating Low-Batch MoE Inference via Multi-chiplet Architecture and Dynamic Expert Trajectory Scheduling By Songchen Ma, AI Chip Center for Emerging Smart Systems March 31, 2026
WarPGNN: A Parametric Thermal Warpage Analysis Framework with Physics-aware Graph Neural Network By Haotian Lu, University of California March 26, 2026
DUET: Disaggregated Hybrid Mamba-Transformer LLMs with Prefill and Decode-Specific Packages By Alish Kanani, University of Wisconsin–Madison March 25, 2026