ABSTRACT
While set-associative caches incur fewer misses than direct-mapped caches, they typically have slower hit times and higher power consumption when multiple tag and data banks are probed in parallel. This paper presents the location cache structure, which significantly reduces the power consumption of large set-associative caches. We propose to use a small cache, called the location cache, to store the location of future cache references. If there is a hit in the location cache, the supported cache is accessed as a direct-mapped cache. Otherwise, the supported cache is referenced as a conventional set-associative cache. The worst-case access latency of the location cache system is the same as that of a conventional cache. The location cache is virtually indexed so that operations on it can be performed in parallel with the TLB address translation. These advantages make it ideal for L2 cache systems, where traditional way-prediction strategies perform poorly. We used the CACTI cache model to evaluate the power consumption and access latency of the proposed cache architecture. The SimpleScalar CPU simulator was used to produce the final results. The results show that the proposed location cache architecture is power-efficient: in the simulated cache configurations, cache access energy is reduced by up to 47% and average cache access latency by up to 25%.
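The access flow described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' design: the class names (`ToyCache`, `LocationCachedL2`), the LRU replacement, and the policy of recording the way of each block just accessed are all hypothetical choices made for illustration, since the abstract does not say how the location cache is filled. Virtual indexing and the TLB are not modeled, and the number of ways probed stands in as a rough proxy for access energy.

```python
from collections import OrderedDict


class ToyCache:
    """Tag-only set-associative cache with LRU replacement (illustrative)."""

    def __init__(self, num_sets, ways):
        self.num_sets, self.ways = num_sets, ways
        # One OrderedDict per set: block address -> way, least recent first.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def lookup(self, block):
        """Return (way, hit, evicted_block); the block is filled on a miss."""
        s = self.sets[block % self.num_sets]
        if block in s:
            s.move_to_end(block)
            return s[block], True, None
        evicted = None
        if len(s) < self.ways:
            way = len(s)  # next unused way in this set
        else:
            evicted, way = s.popitem(last=False)  # reuse the LRU block's way
        s[block] = way
        return way, False, evicted


class LocationCachedL2:
    """Supported cache plus a small location cache (block address -> way)."""

    def __init__(self, num_sets, ways, loc_entries):
        self.cache = ToyCache(num_sets, ways)
        self.loc = OrderedDict()  # assumed LRU-managed location cache
        self.loc_entries = loc_entries
        self.ways_probed = 0           # with the location cache
        self.baseline_ways_probed = 0  # conventional cache: all ways, always

    def access(self, block):
        self.baseline_ways_probed += self.cache.ways
        known_way = self.loc.get(block)
        way, hit, evicted = self.cache.lookup(block)
        if known_way is not None:
            # Location cache hit: access as a direct-mapped cache (one way).
            assert hit and known_way == way
            self.ways_probed += 1
        else:
            # Location cache miss: conventional access, all ways in parallel.
            self.ways_probed += self.cache.ways
        if evicted is not None:
            # Assumption: entries are invalidated when their block is evicted,
            # so a location hit always points at a valid way.
            self.loc.pop(evicted, None)
        self.loc[block] = way
        self.loc.move_to_end(block)
        if len(self.loc) > self.loc_entries:
            self.loc.popitem(last=False)
        return hit


if __name__ == "__main__":
    l2 = LocationCachedL2(num_sets=4, ways=4, loc_entries=2)
    for block in [0, 0, 1, 0, 1]:
        l2.access(block)
    print(l2.ways_probed, "ways probed vs.", l2.baseline_ways_probed, "baseline")
```

In this model a location cache miss costs exactly what a conventional access costs, which mirrors the abstract's statement that the worst case is no slower than a conventional cache. The savings come entirely from accesses whose way is already known.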
Index Terms
- Location cache: a low-power L2 cache system