LIGERO: A light but efficient router conceived for cache-coherent chip multiprocessors

Abstract: Although abstraction is the best approach to deal with computing system complexity, sometimes implementation details should be considered. Considering on-chip interconnection networks in particular, underestimating the underlying system specificity could have nonnegligible impact on performance, cost, or correctness. This article presents a very efficient router that has been devised to deal with cache-coherent chip multiprocessor particularities in a balanced way. Employing the same principles of packet rotation structures as in the rotary router, we present a router configuration with the following novel features: (1) reduced buffering requirements, (2) optimized pipeline under contentionless conditions, (3) more efficient deadlock avoidance mechanism, and (4) optimized in-order delivery guarantee. Putting it all together, our proposal provides a set of features that no other router, to the best of our knowledge, has achieved previously. These are: (1') low implementation cost, (2') low pass-through latency under low load, (3') improved resource utilization through adaptive routing and a buffering scheme free of head-of-line blocking, (4') guarantee of coherence protocol correctness via end-to-end deadlock avoidance and in-order delivery, and (5') improvement of coherence protocol responsiveness through adaptive in-network multicast support. We conduct a thorough evaluation that includes hardware cost estimation and performance evaluation under a wide spectrum of realistic workloads and coherence protocols. Comparing our proposal with VCTM, an optimized state-of-the-art wormhole router, it requires 50% less area, reduces on-chip cache hierarchy energy delay product on average by 20%, and improves the cache-coherency chip multiprocessor performance under realistic working conditions by up to 20%.

Authorship: Abad P., Puente V., Gregorio J.A.

Source: ACM Transactions on Architecture and Code Optimization, 2013, 9(4), 1-21

Publisher: Association for Computing Machinery (ACM)

Year of publication: 2013

No. of pages: 21

Publication type: Article

DOI: 10.1145/2400682.2400696

ISSN: 1544-3566,1544-3973

Spanish project: TIN2010-18159

Publication Url: https://doi.org/10.1145/2400682.2400696