King Abdullah University of Science and Technology
AI accelerator
EdgeCIM: A Hardware-Software Co-Design for CIM-Based Acceleration of Small Language Models
Thu, Sep 25 2025
Research
AI accelerator
The deployment of language models is rapidly shifting from datacenters to edge devices such as laptops, smartphones, and embedded platforms, driven by the demand for interactive, low-latency, and privacy-preserving applications. In this context, Small Language Models (SLMs) have emerged as practical candidates, yet their inference exposes inefficiencies in conventional accelerators. While GPUs and NPUs process the GEMM-heavy prefill stage efficiently, they remain underutilized during the GEMV-dominated decoding phase, resulting in limited throughput and excessive energy consumption at the edge.
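As a rough illustration of the prefill/decoding contrast described above, the sketch below estimates arithmetic intensity (floating-point operations per byte moved) for one weight matrix applied to many tokens at once (GEMM, as in prefill) versus a single token (GEMV, as in decoding). It is not taken from the EdgeCIM work: the function name, the 2048 x 2048 layer size, the 512-token prompt, and the 2-byte (FP16) element size are all assumptions chosen for illustration, and the model ignores caches and on-chip reuse.

```python
def arithmetic_intensity(tokens, d_in, d_out, bytes_per_elem=2):
    """Estimate FLOPs per byte for multiplying a (tokens x d_in) activation
    block by a (d_in x d_out) weight matrix.

    Hypothetical helper: assumes each operand is read or written exactly once.
    """
    # One multiply and one add per (token, input, output) triple.
    flops = 2 * tokens * d_in * d_out
    # Weights, input activations, and output activations.
    elems_moved = d_in * d_out + tokens * d_in + tokens * d_out
    return flops / (elems_moved * bytes_per_elem)


# Assumed sizes, for illustration only.
prefill = arithmetic_intensity(tokens=512, d_in=2048, d_out=2048)  # GEMM
decode = arithmetic_intensity(tokens=1, d_in=2048, d_out=2048)     # GEMV

print(f"prefill (GEMM): {prefill:.1f} FLOPs/byte")
print(f"decode  (GEMV): {decode:.2f} FLOPs/byte")
```

Under these assumptions the GEMV case performs roughly one operation per byte fetched, so it is limited by memory bandwidth rather than compute, which is consistent with the underutilization the abstract describes.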
Efficient AI Across Edge, Near-Edge, and Cloud
Thu, Sep 25 2025
Research
AI accelerator
Modern applications like smart cameras, self-driving cars, and VR devices rely on powerful AI models. Running these models quickly and efficiently across phones, edge devices, and cloud servers is a tough challenge. Our work develops two frameworks to make this possible: DONNA finds the best way to split and run AI models across different types of devices, from traditional CPUs and GPUs to new Compute-In-Memory (CIM) accelerators, so they use less energy while staying fast. HiDist takes the idea further by looking at the whole system: edge devices near the user, stronger near-edge servers, and