Beyond Inference: Exploratory Perspectives on Leveraging NPUs for Scheduling and Optimization
Keywords:
Neural Processing Unit, Scheduling, Optimization, CPU–NPU Cooperation, Resource Management
Abstract
Neural Processing Units (NPUs) are becoming integral components of modern computing systems, primarily as accelerators for inference workloads in areas such as computer vision, natural language processing, and recommendation systems. Despite this increasing prevalence, NPUs remain underutilized in many deployments, often wasting energy while sitting idle outside of inference contexts. At the same time, CPU scheduling continues to face challenges in balancing throughput, latency, efficiency, and energy consumption under highly variable workloads. Current scheduling paradigms focus primarily on maximizing computational throughput and minimizing latency. For example, the Linux kernel recently adopted the Earliest Eligible Virtual Deadline First (EEVDF) scheduler to improve latency handling, building on the earlier Completely Fair Scheduler (CFS), which emphasized fairness. Yet in real-world scenarios workload variance is often high, and trade-offs among latency, throughput, and efficiency are typically fixed statically, based on expected conditions rather than actual runtime demands. As a result, scheduling parameters rarely adapt in real time to dynamic workloads and real-world usage patterns.
Together, these observations highlight an opportunity to explore the potential of NPUs beyond traditional inference acceleration. By leveraging their parallel processing capabilities and low-power characteristics, NPUs could assist or even take a more active role in dynamic scheduling, enabling adaptive resource management that responds to real-time workload demands. This paper presents an exploratory framework for investigating the integration of NPUs into scheduling. Rather than reporting empirical results, it outlines conceptual pathways and architectural possibilities.
License
Copyright (c) 2026 DMPedia Lecture Notes in Multidisciplinary Research

This work is licensed under a Creative Commons Attribution 4.0 International License.