Department of Information Technology

UART Publications

Efficient Synchronization for Nonuniform Communication Architectures

Zoran Radovic and Erik Hagersten

In Proceedings of Supercomputing 2002 (SC2002), Baltimore, Maryland, November 2002.

Abstract

Scalable parallel computers are often nonuniform communication architectures (NUCAs), where the access time to other processors' caches varies with their physical location. Still, few attempts have been made to explore cache-to-cache communication locality. This paper introduces a new kind of synchronization primitive (lock-unlock) that favors neighboring processors when a lock is released. This improves the lock handover time as well as the access time to the shared data of the critical region. A critical section guarded by our new RH lock takes less than half the time to execute compared with the same critical section guarded by any other lock on our NUCA hardware. The execution time for Raytrace with 28 processors was improved 2.23-4.68 times, while global traffic was dramatically decreased compared with all the other locks. Averaged over the seven applications studied, the execution time was improved 7-24% while the global traffic was decreased 8-28%.
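
The idea of favoring neighboring processors at lock release can be illustrated with a small simulation. The sketch below is not the RH lock algorithm from the paper; it is a hypothetical handover policy, written only to show why preferring waiters on the releaser's node reduces cross-node (global) handovers. The names `pick_next`, `count_remote_handovers`, and the fairness bound `max_local` are assumptions introduced for this example.

```python
def pick_next(waiters, releaser_node, local_streak, max_local=4):
    """Return the index in `waiters` of the next lock owner.

    waiters: list of (cpu_id, node_id) tuples in arrival order.
    Prefers the oldest waiter on the releaser's node. Once the lock has
    been handed over locally `max_local` times in a row (a hypothetical
    fairness bound), the oldest waiter overall is chosen instead.
    """
    if local_streak < max_local:
        for i, (_, node) in enumerate(waiters):
            if node == releaser_node:
                return i
    return 0  # fall back to plain FIFO order


def count_remote_handovers(arrivals, first_node, locality_aware, max_local=4):
    """Count handovers that cross a node boundary until all waiters are served."""
    waiters = list(arrivals)
    owner_node, streak, remote = first_node, 0, 0
    while waiters:
        if locality_aware:
            i = pick_next(waiters, owner_node, streak, max_local)
        else:
            i = 0  # FIFO baseline
        _, node = waiters.pop(i)
        if node == owner_node:
            streak += 1
        else:
            remote += 1
            streak = 0
        owner_node = node
    return remote


# Six waiters alternating between two nodes; the lock starts on node 0.
arrivals = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1)]
fifo_remote = count_remote_handovers(arrivals, 0, locality_aware=False)
local_remote = count_remote_handovers(arrivals, 0, locality_aware=True)
print(fifo_remote, local_remote)
```

In this toy setting the locality-aware policy serves all node-0 waiters before crossing to node 1, so the lock changes node far less often than under FIFO. A smaller `max_local` trades some of that locality for fairness toward remote waiters.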

Available as PDF (123 kB)

BibTeX file entry: Radovic:2002:nov

Updated 2003-10-15 17:00:23 by Zoran Radovic.