Uppsala Architecture Research Team
Reducing Cache Pollution Through Detection and Elimination of Non-Temporal Memory Accesses
Contention for shared cache resources has been recognized as a major bottleneck for multicores, especially for mixed workloads of independent applications. While most modern processors implement instructions to manage caches, these instructions are largely unused due to a lack of understanding of how to best leverage them.
This paper introduces a classification of applications into four cache usage categories. We discuss how applications from different categories affect each other's performance indirectly through cache sharing and devise a scheme to optimize such sharing. We also propose a low-overhead method to automatically find the best per-instruction cache management policy.
We demonstrate how the indirect cache-sharing effects of mixed workloads can be tamed by automatically altering some instructions to better manage cache resources. Practical experiments demonstrate that our software-only method can improve application performance by up to 35% on x86 multicore hardware.
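The pollution effect described above can be illustrated with a toy model. The sketch below is not the paper's method or tooling: it assumes a small fully associative LRU cache, a hypothetical trace in which a reused ("hot") working set is interleaved with streaming accesses that are never touched again, and a `bypass_streaming` flag that stands in for turning the streaming accesses into non-temporal ones. The names `build_trace` and `simulate`, and all sizes, are chosen only for illustration.

```python
from collections import OrderedDict


def build_trace(rounds, hot_lines, stream_per_round):
    """Build (address, is_streaming) pairs: a reused working set
    interleaved with streaming addresses that never repeat."""
    trace = []
    next_stream = 1_000_000  # streaming addresses live far from the hot set
    for _ in range(rounds):
        for addr in range(hot_lines):
            trace.append((addr, False))
        for _ in range(stream_per_round):
            trace.append((next_stream, True))
            next_stream += 1
    return trace


def simulate(trace, capacity, bypass_streaming):
    """Run the trace through a fully associative LRU cache and return
    the miss ratio of the reused (non-streaming) accesses only."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    reuse_accesses = 0
    reuse_misses = 0
    for addr, is_streaming in trace:
        hit = addr in cache
        if not is_streaming:
            reuse_accesses += 1
            if not hit:
                reuse_misses += 1
        if hit:
            cache.move_to_end(addr)  # refresh LRU position
        elif not (is_streaming and bypass_streaming):
            cache[addr] = True  # allocate the line on a miss
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
        # A bypassed streaming miss allocates nothing, so it cannot evict hot data.
    return reuse_misses / reuse_accesses


if __name__ == "__main__":
    trace = build_trace(rounds=100, hot_lines=6, stream_per_round=4)
    for bypass in (False, True):
        ratio = simulate(trace, capacity=8, bypass_streaming=bypass)
        print(f"bypass_streaming={bypass}: hot-set miss ratio = {ratio:.2f}")
```

In this model the hot set fits in the cache on its own, yet the streaming lines evict it every round when they are allocated normally. Once they bypass the cache, only the initial cold misses remain. Real hardware differs in associativity, replacement policy, and how non-temporal hints are honored, so the numbers here say nothing about actual speedups.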
Reductions in cache miss ratio from changing cache-polluting instructions to non-temporal accesses. Poster
- Reducing Cache Pollution Through Detection and Elimination of Non-Temporal Memory Accesses. In Proc. International Conference for High Performance Computing, Networking, Storage and Analysis: SC 2010, p 11, IEEE, Piscataway, NJ, 2010. (DOI, fulltext:print).