Themis: Fair and efficient GPU cluster scheduling for machine learning workloads
K Mahajan, A Singhvi, A Balasubramanian, V Batra… - arXiv preprint arXiv …, 2019
Modern distributed machine learning (ML) training workloads benefit significantly from leveraging GPUs. However, contention ensues when multiple such workloads run atop a shared cluster of GPUs. A key question is how to fairly apportion GPUs across workloads. We find that established cluster scheduling disciplines are a poor fit because of ML workloads' unique attributes: ML jobs have long-running tasks that need to be gang-scheduled, and their performance is sensitive to tasks' relative placement.
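The gang-scheduling constraint named in the abstract is the crux of why conventional schedulers fit poorly: all of a job's tasks must start together or not at all. The minimal Python sketch below illustrates the idea under toy assumptions (this is not Themis's scheduler; the Job model, the one-GPU-per-task rule, and all names are hypothetical), showing how a job can be left waiting even when some GPUs sit free.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    num_tasks: int  # tasks that must all start simultaneously


def gang_schedule(jobs, free_gpus):
    """Toy gang scheduler: admit a job only if every one of its tasks
    can be placed at once (one GPU per task); never place partially."""
    running, waiting = [], []
    for job in jobs:
        if job.num_tasks <= free_gpus:
            free_gpus -= job.num_tasks
            running.append(job.name)
        else:
            # Not enough GPUs for the whole gang: the job waits,
            # even though some GPUs may remain idle.
            waiting.append(job.name)
    return running, waiting, free_gpus


if __name__ == "__main__":
    jobs = [Job("resnet", 4), Job("bert", 8), Job("gan", 2)]
    running, waiting, idle = gang_schedule(jobs, free_gpus=10)
    print(running)  # ['resnet', 'gan']
    print(waiting)  # ['bert'] -- needs 8 GPUs, only 6 were left after resnet
    print(idle)     # 4 GPUs stay idle rather than run bert partially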