Intelligent energy pairing scheduler (InEPS) for heterogeneous HPC clusters

Abstract: In recent years, energy consumption has become a limiting factor in the evolution of high-performance computing (HPC) clusters, owing to both environmental concerns and maintenance costs. The computing power of these clusters is increasing, together with the demands of the workloads they execute. A key component in HPC systems is the workload manager, whose operation has a substantial impact on the performance and energy consumption of the clusters. Recent research has employed machine learning techniques to optimise the operation of this component. However, these attempts have focused on homogeneous clusters where all the cores are pooled together and considered equal, disregarding the fact that cores are contained in nodes and can differ in performance. This work presents an intelligent job scheduler based on deep reinforcement learning that focuses on reducing the energy consumption of heterogeneous HPC clusters. To this end, it leverages information provided by the users as well as the power consumption specifications of the cluster's compute resources. The scheduler is evaluated against a set of heuristic algorithms, showing that it has the potential to give similar results, even in the face of the extra complexity of the heterogeneous cluster.

 Source: Journal of Supercomputing, 2025, 81(2), 427

 Publisher: Kluwer Academic Publishers

 Publication date: 01/01/2025

 No. of pages: 23

 Publication type: Article

 DOI: 10.1007/s11227-024-06907-y

 ISSN: 0920-8542,1573-0484

 Spanish project: PID2022-136454NB-C21

 Publication Url: https://doi.org/10.1007/s11227-024-06907-y

Authorship

LÓPEZ, MARTA

ESTEBAN STAFFORD FERNANDEZ