Serving HTC Users in Kubernetes by Leveraging HTCondor
Talk Title | Serving HTC Users in Kubernetes by Leveraging HTCondor |
Speakers | Igor Sfiligoi (Lead Scientific Software Developer and Researcher, University of California San Diego) |
Conference | KubeCon + CloudNativeCon North America |
Conf Tag | |
Location | San Diego, CA, USA |
Date | Nov 15-21, 2019 |
URL | Talk Page |
Slides | Talk Slides |
Video | |
High Throughput Computing (HTC), sometimes also called batch computing, has long been, and still is, the major workhorse for most R&D organizations. Typical workloads include parameter sweeps, Monte Carlo simulations, and partitionable dataset processing. Kubernetes by itself is not well suited to such workloads, which are submitted by hundreds of concurrent users and involve executing thousands, or even millions, of small tasks. This presentation will provide an overview of how HTCondor, a prominent HTC system, can be used to manage such workloads effectively and efficiently. The author has been running such a system on a Kubernetes cluster operated out of the University of California San Diego, and will share his experience and the issues he encountered during that time.