Feedback-controlled resource sharing for predictable escience
SM Park, M Humphrey - SC'08: Proceedings of the 2008 ACM …, 2008 - ieeexplore.ieee.org
SC'08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, 2008
The emerging class of adaptive, real-time, data-driven applications poses a significant problem for today's HPC systems. In general, it is extremely difficult for queuing-system-controlled HPC resources to make and guarantee a tightly bounded prediction of the time at which a newly submitted application will execute. While a reservation-based approach partially addresses the problem, it can create severe resource under-utilization (unused reservations, necessary scheduled idle slots, underutilized reservations, etc.) that resource providers are eager to avoid. In contrast, this paper presents a fundamentally different approach to guaranteeing predictable execution. By creating a virtualized application layer called the performance container, and opportunistically multiplexing concurrent performance containers through the application of formal feedback control theory, we regulate the job's progress so that the job meets its deadline without requiring exclusive access to resources, even in the presence of a wide class of unexpected disturbances. Our evaluation using two widely used applications, WRF and BLAST, on an 8-core server shows that our approach is predictable and meets deadlines with an average error of 3.4% while achieving high overall utilization.
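The abstract does not specify the controller, so the following is only a minimal sketch of the general idea of feedback-regulated progress, not the authors' performance container. It assumes a single job whose progress is steered toward a linear reference trajectory by a discrete PI controller that adjusts a CPU share. The names `PIController` and `simulate`, the gains, the progress model, and the disturbance window are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Discrete PI controller. Gains are illustrative, not from the paper."""
    kp: float = 10.0
    ki: float = 1.0
    integral: float = 0.0

    def update(self, error: float, dt: float) -> float:
        # Accumulate the error, then combine proportional and integral terms.
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def simulate(deadline=100.0, dt=1.0, base_rate=0.025,
             disturbance=0.3, closed_loop=True):
    """Simulate one job whose progress (0..1) should reach 1.0 at `deadline`.

    base_rate:   progress per second at a full CPU share (assumed model).
    disturbance: fraction of throughput lost for 30 <= t < 60 (assumed).
    closed_loop: if False, the share stays at its nominal value.
    Returns the simulated completion time, or None if the job never finishes.
    """
    ctrl = PIController()
    # Share that would finish exactly on time if nothing went wrong.
    nominal_share = 1.0 / (base_rate * deadline)
    progress, t = 0.0, 0.0
    while t < 2 * deadline:
        target = min(1.0, t / deadline)   # linear reference trajectory
        error = target - progress         # positive when the job is behind
        share = nominal_share
        if closed_loop:
            share += ctrl.update(error, dt)
        share = min(1.0, max(0.05, share))  # a share is a bounded resource
        slowdown = disturbance if 30.0 <= t < 60.0 else 0.0
        progress += base_rate * share * (1.0 - slowdown) * dt
        t += dt
        if progress >= 1.0:
            return t
    return None


if __name__ == "__main__":
    for mode in (False, True):
        finish = simulate(closed_loop=mode)
        label = "closed loop" if mode else "open loop"
        print(f"{label}: finished at t={finish}")
```

In this toy model the open-loop run falls behind during the disturbance and never catches up, while the closed-loop run raises the share temporarily and finishes close to the deadline without ever holding the whole machine. Because the share is only raised while the job is behind, the remaining capacity stays available to other jobs, which is the intuition behind sharing resources instead of reserving them.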