REX: a remote execution model for continuous scalability in multi-chiplet-module GPUs

Abstract: Monolithic GPU architectures face growing limitations due to power density, yield issues, and manufacturing complexity, motivating a shift toward multi-chiplet designs. While promising, these architectures struggle with workloads exhibiting irregular memory access patterns, where static data placement is often insufficient. Though data locality can help, it does not adapt well to dynamic access behaviour, leading to performance degradation. This paper introduces REX, a runtime mechanism that migrates threads to the chiplet where their data resides, adapting dynamically to the generated memory access patterns with a fine granularity. By relocating computation instead of data, REX improves locality and minimises remote memory accesses, which are especially costly in multi-chiplet environments. As a result, it reduces inter-chiplet traffic and scales efficiently with the number of chiplets. On irregular workloads, the solution demonstrates consistent performance gains, averaging a 13 % speedup, with improvements reaching up to 38 %. Moreover, its scalability with chiplet count is particularly noteworthy, delivering a 25 % average gain, and peaking at an impressive 84 % in the most favourable scenarios.

 Source: Future Generation Computer Systems, 2026, 178, 108268

 Publisher: Elsevier

 Publication date: 1 May 2026

 Number of pages: 16

 Publication type: Journal article

 DOI: 10.1016/j.future.2025.108268

 ISSN: 0167-739X, 1872-7115

 Spanish project: PID2022-136454NB-C21

 Publication URL: https://doi.org/10.1016/j.future.2025.108268