A Novel Approach to Cache Block Reuse Predictions.

This work introduces a novel refresh mechanism that leverages reuse information to decide which blocks should be refreshed in an energy-aware eDRAM last-level cache. ... Experimental results show that, compared to a conventional eDRAM cache, the energy-aware approach achieves refresh energy savings of up to 71% and reduces overall dynamic energy by 65% ... The devised refresh policy exploits reuse information to decide whether a cache block should be refreshed. ...

doi:10.1145/2464996.2467278 dblp:conf/ics/ValeroSPD13 fatcat:pu2tpvlxrffmjmrdqghlniz5e4

Acknowledgment The authors would like to thank the members of the ALF team for their feedback on this work. ... Problem: Prior studies [4, 3, 12, 1, 5, 2] have proposed novel approaches to predict the reuse behavior of applications and, hence their ability to utilize the cache. ... Several prior mechanisms [3, 1, 12, 5, 2] have used this approach: the general goal being to assign cache space (not explicitly but by reuse prediction) to applications that could utilize the cache better ...

doi:10.1016/j.jpdc.2017.02.004 fatcat:24hmq6ycqnhhxcmyt2hzaxrmpu

On a predicted last touch, the referenced cache block is marked for early eviction. ... This permits cache blocks lower in the LRU stack, but with shorter reuse distances, to remain in cache longer, resulting in additional cache hits. ... Acknowledgements The authors would like to thank Xuanhua Li, Hameed Badawy, Steve Crago, Vida Kianzad, Seungryul Choi, Inseok Choi, Aamer Jaleel, Janice McMahon, Priyanka Rajkhowa, and Meng-Ju Wu for helpful ...

dblp:journals/jilp/LiuY09 fatcat:eqf5mahskrbmbkwfboo5xaab2a

In this work, we propose a learning-aided approach to predict future data accesses. ... We find that a powerful LSTM-based recurrent neural network model can provide high prediction accuracy based on only a cache trace as input. ... This paper strives to predict forward reuse distance based on only the past cache trace by applying a novel learning-aided approach. ...

arXiv:2007.15859v1 fatcat:qn5topsvfrhvzc4dqavkm656oy

We propose a monitoring mechanism that dynamically samples cache sets to estimate the Footprint-number of applications and classifies them into discrete (distinct and more than two) priority buckets. ... The cache replacement policy leverages this classification and assigns priorities to cache lines of applications during cache replacement operations. ... ACKNOWLEDGMENT The authors would like to thank the members of the ALF team for their suggestions. This work is partially supported by ERC Advanced Grant DAL No. 267175. ...

doi:10.1109/ipdps.2016.30 dblp:conf/ipps/SridharanS16 fatcat:bq2i7o7kn5be7jkhda424erpwe

To solve this problem, we have developed a novel architecture and a WCET analysis framework for this architecture. ... Our work classifies predictable and unpredictable accesses and allocates them into predictable caches and unpredictable caches respectively, using the CME (Cache Miss Equations) and reuse-distance based ... Predictable Cache Architectures Cache partitioning [8] is a mechanism developed to reserve blocks of cache for individual tasks such that cache hits become predictable. ...

doi:10.1109/icecs.2008.4674877 dblp:conf/icecsys/LiFYCHT08 fatcat:b4ou52jjlzhrlngkqucczgtsxy

access patterns that it uses to build memory reuse distance distribution models for each basic block, (iii) runs detailed basic-block level simulations to determine hardware pipeline usage. ... We analyze the application of multi-variate regression models that accurately predict the reuse profiles and the basic block counts. ... The goal of the AMMP approach is to predict the performance of a software on a target architecture. ...

arXiv:2010.04212v2 fatcat:53bor5hw5zgxpp5feymwg5imsy

KPC cache management has three novel contributions. First, a prefetcher which approximates the future use distance of prefetch requests based on its prediction confidence. ... Finally, KPC removes the need to propagate the PC through entire on-chip cache hierarchy while providing a holistic cache management approach with better performance than state-of-the-art PC-, and non-PC-based ... Blocks predicted to have high reuse will be inserted with a high priority, i.e., at the MRU position, blocks predicted to have low reuse will be inserted in a low priority position such as the LRU position ...

doi:10.1145/3093336.3037701 fatcat:d2twfr4cuzcwpcgaytopglk2eu
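The insertion rule quoted in the snippet above (blocks predicted to have high reuse go to the MRU position, blocks predicted to have low reuse go to the LRU position) can be illustrated with a minimal sketch. This is not the KPC implementation. The class name `PredictedInsertionSet` and the boolean `predicted_high_reuse` argument are hypothetical, and the reuse prediction itself is assumed to come from elsewhere.

```python
class PredictedInsertionSet:
    """One cache set kept as a recency stack (hypothetical illustration).

    Index 0 is the MRU position; the last index is the LRU position.
    """

    def __init__(self, ways):
        self.ways = ways
        self.stack = []  # block tags, MRU first

    def access(self, tag, predicted_high_reuse):
        """Return True on a hit and False on a miss."""
        if tag in self.stack:
            # Hit: promote the block to MRU, as plain LRU would.
            self.stack.remove(tag)
            self.stack.insert(0, tag)
            return True
        if len(self.stack) == self.ways:
            self.stack.pop()  # evict the block in the LRU position
        if predicted_high_reuse:
            self.stack.insert(0, tag)  # high priority: MRU position
        else:
            self.stack.append(tag)  # low priority: LRU position
        return False


cache_set = PredictedInsertionSet(ways=2)
cache_set.access("A", predicted_high_reuse=True)
cache_set.access("B", predicted_high_reuse=False)
cache_set.access("C", predicted_high_reuse=False)  # evicts B, not A
hit_on_a = cache_set.access("A", predicted_high_reuse=True)
```

In this two-way example, plain LRU insertion would have evicted A when C arrived. Inserting the low-reuse blocks at the LRU position keeps A resident, so the final access hits.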

Once identified, dead blocks are evicted from LLC to make space for potentially high reuse cache blocks. ... In this thesis, we identify variability in the reuse behavior of cache blocks as the key limiting factor in maximizing cache efficiency for state-of-the-art predictive techniques. ... approach to classify a cache block as low or high reuse at the time of inserting a new block in the cache. ...

arXiv:2006.08487v1 fatcat:4stqscurbbcb3am5lw33c4pksm

In multithreaded applications, it guides the replacement strategy by monitoring the shared reuse state of every cache block inside the partition during runtime. ... For huge cores to make more efficient LLC visits, cache replacement algorithms generally modify the leftover state while preserving cache blocks needed by them. ... This work presents a first-of-its-kind core-to-core communication approach that relies on cache-aware data lookups. ...

doi:10.33545/2707661x.2022.v3.i2a.70 fatcat:u3zu4wrttjav5ig6jjgmmuqqqe

Cache bypassing is a promising technique to increase effective cache capacity without incurring power/area costs of a larger sized cache. ... However, injudicious use of cache bypassing can lead to bandwidth congestion and increased miss-rate and hence, intelligent techniques are required to harness its full potential. ... A block with no predicted reuse is stored in a bypass buffer while remaining blocks are stored in the cache. ...

doi:10.3390/jlpea6020005 fatcat:rkiqtcjbcvggde5utaqogg5xxa

A novel indirect control transfer chaining approach is proposed in this paper. ... Translated code is usually organized as code blocks in the code cache and each code block transfer control to the next one through a control transfer instruction. ... A software prediction approach is proposed in [11] to predict the target address of the indirect control transfer instructions. ...

doi:10.1007/978-3-642-23300-5_24 fatcat:xc65vdlaijgczcbhhedcg7akhq

In response, we introduce a new metric, Live Distance, that uses the stack distance to learn the temporal reuse characteristics of cache blocks, thus enabling a dead block predictor that is robust to variability ... This paper identifies variability in generational behavior of cache blocks as a key challenge for cache management policies that aim to identify dead blocks as early and as accurately as possible to maximize ... If predicted-live-distance is 0, the block is expected to have no reuse and is bypassed to higher level cache. ...

doi:10.1109/pact.2017.32 dblp:conf/IEEEpact/FalduG17 fatcat:7nu4squeivg6xe4mwqr2wqngda

The experimental results show that the path-based reuse distance is highly predictable, as a function of the data size, for a set of SPEC CPU2000 programs. ... Based on memory profiling, reuse distance analysis has shown much promise in predicting data locality for a program using inputs other than the profiled ones. ... Conclusions and Future Work In this paper, we have proposed a novel approach for path-based reuse-distance analysis. ...

doi:10.1007/11688839_4 fatcat:p7ihzzhuijdinazoztpv7l6jmm
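Several of the entries above rely on reuse distance. As a point of reference, here is a minimal sketch of the conventional definition: the number of distinct addresses touched between two consecutive accesses to the same address. It is not the path-based analysis proposed in the paper above; the function name `reuse_distances` is hypothetical, and the quadratic list scan is chosen for clarity rather than speed.

```python
def reuse_distances(trace):
    """Return one entry per access in `trace`.

    Each entry is the number of distinct addresses touched since the
    previous access to the same address, or None for a first access.
    """
    stack = []  # distinct addresses, most recently used last
    distances = []
    for addr in trace:
        if addr in stack:
            pos = stack.index(addr)
            # Everything above `pos` was touched since the last access to addr.
            distances.append(len(stack) - 1 - pos)
            stack.pop(pos)
        else:
            distances.append(None)
        stack.append(addr)
    return distances


example = reuse_distances(["a", "b", "c", "a", "b", "b"])
# example == [None, None, None, 2, 2, 0]
```

Under LRU replacement, an access with reuse distance d hits in a fully associative cache that holds more than d blocks, which is why the metric is a common proxy for data locality.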

We also propose a novel Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature. ... A fundamental challenge, however, is how to best predict the re-reference pattern of an incoming cache line. ... We also thank Yu-Yuan Chen, Daniel Lustig, and the anonymous reviewers for their useful insights related to this work. This material is based upon work supported by the National Science References ...

doi:10.1145/2155620.2155671 dblp:conf/micro/WuJHMSE11 fatcat:mafbit5x7bbcpnq6i2zahkuwpe
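The idea of learning re-reference behavior per signature can be sketched with a table of saturating counters. This is a simplified illustration under stated assumptions, not the SHiP design as published. The class name `SignatureHitPredictor`, the table size, the counter width, the initial counter value, and the use of Python's `hash` as the index function are all hypothetical choices.

```python
class SignatureHitPredictor:
    """Saturating counters indexed by a signature (hypothetical sketch)."""

    def __init__(self, table_size=1024, counter_max=7):
        self.table_size = table_size
        self.counter_max = counter_max
        # Assumed initial value: weakly predict reuse until trained otherwise.
        self.counters = [1] * table_size

    def _index(self, signature):
        return hash(signature) % self.table_size

    def predicts_reuse(self, signature):
        """A counter of zero means lines with this signature were not re-referenced."""
        return self.counters[self._index(signature)] > 0

    def on_hit(self, signature):
        """Train up when a line inserted under this signature is re-referenced."""
        i = self._index(signature)
        self.counters[i] = min(self.counter_max, self.counters[i] + 1)

    def on_evict_unused(self, signature):
        """Train down when a line is evicted without ever being re-referenced."""
        i = self._index(signature)
        self.counters[i] = max(0, self.counters[i] - 1)


predictor = SignatureHitPredictor(table_size=16)
signature = 0x40  # hypothetical signature, e.g. derived from a program counter
before = predictor.predicts_reuse(signature)
predictor.on_evict_unused(signature)
after_dead_eviction = predictor.predicts_reuse(signature)
predictor.on_hit(signature)
after_hit = predictor.predicts_reuse(signature)
```

A replacement policy could consult `predicts_reuse` when inserting a line and give lines with a negative prediction a low-priority position.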

Exploiting reuse information to reduce refresh energy in on-chip eDRAM caches

Dynamic and discrete cache insertion policies for managing shared last level caches in large multicores

Enhancing LTP-Driven Cache Management Using Reuse Distance Information

Learning Forward Reuse Distance [article]

Discrete Cache Insertion Policies for Shared Last Level Cache Management on Large Multicores

Tighter WCET analysis of input dependent programs with classified-cache memory architecture

Machine Learning Enabled Scalable Performance Prediction of Scientific Codes [article]

Kill the Program Counter

Addressing Variability in Reuse Prediction for Last-Level Caches [article]

Examining partitioned caches performance in heterogeneous multi-core processors

A Survey of Cache Bypassing Techniques

A Novel Chaining Approach to Indirect Control Transfer Instructions [chapter]

Leeway: Addressing Variability in Dead-Block Prediction for Last-Level Caches

Path-Based Reuse Distance Analysis [chapter]

SHiP