An energy efficient TCAM enhanced cache architecture
Abstract
Microprocessors are used in a variety of systems, ranging from high-performance supercomputers running scientific applications to battery-powered cell phones performing real-time tasks. Due to the large disparity between processor clock speed and main memory access time, most modern processors include several caches, which consume more than half of the total chip area and power budget. As the performance gap between processors and memory has increased, the trend has been to increase the size of the on-chip caches. However, increasing the cache size also increases its access time and energy consumption. This growing power dissipation problem is making traditional cooling and packaging techniques less effective, requiring cache designers to focus more on architectural-level energy efficiency than on performance alone. The goal of this thesis is to propose a new cache architecture and to evaluate its efficiency in terms of miss rate, system performance, energy consumption, and area overhead. The proposed architecture uses a few Ternary-CAM (TCAM) cells in the tag array to enable dynamic compression of tag entries containing contiguous values. By dynamically compressing tag entries, the number of entries in the tag array can be reduced by a factor of 2^N, where N is the number of tag bits that can be compressed. The architecture described in this thesis is applicable to any cache structure that uses Content Addressable Memory (CAM) cells to store tag bits. To evaluate the effectiveness of the TCAM Enhanced Cache Architecture for a wide scope of applications, two case studies were performed: the L2 Data-TLB (DTLB) of a high-performance processor and the L1 instruction and data caches of a low-power embedded processor. Results indicate that an L2 DTLB implementing 3-bit tag compression can achieve 93% of the performance of a conventional L2 DTLB of the same size while reducing the on-chip energy consumption by 74% and the total area by 50%. Similarly, an embedded processor cache implementing 2-bit tag compression achieves 99% of the performance of a conventional cache while reducing the on-chip energy consumption by 33% and the total area by 10%.
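The tag-compression idea can be illustrated with a small software model. The Python sketch below is not the thesis's hardware design: it assumes a simplified, static scheme in which only complete, aligned groups of 2^N contiguous tags are merged into one ternary entry whose low N bits are "don't care". The names `TernaryEntry`, `compress_tags`, and `lookup` are hypothetical and introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TernaryEntry:
    """One tag-array entry: care_mask bit 1 = must match, 0 = don't care."""
    value: int
    care_mask: int

    def matches(self, tag: int) -> bool:
        # A ternary match ignores every bit position whose care bit is 0.
        return ((tag ^ self.value) & self.care_mask) == 0


def compress_tags(tags, tag_bits, n):
    """Merge each complete, aligned run of 2**n contiguous tags into one entry.

    Assumption for this sketch: partial runs are stored uncompressed.
    """
    full_mask = (1 << tag_bits) - 1
    low_mask = (1 << n) - 1
    groups = {}
    for tag in sorted(set(tags)):
        groups.setdefault(tag >> n, []).append(tag)

    entries = []
    for prefix, members in groups.items():
        if len(members) == (1 << n):
            # All 2**n tags sharing this prefix are present: one ternary entry.
            entries.append(TernaryEntry(prefix << n, full_mask & ~low_mask))
        else:
            # Otherwise keep exact-match entries, as in a conventional CAM.
            entries.extend(TernaryEntry(t, full_mask) for t in members)
    return entries


def lookup(entries, tag):
    """Return True if any entry matches the tag (models a parallel CAM search)."""
    return any(entry.matches(tag) for entry in entries)


# Ten tags: eight contiguous (8..15) plus two isolated ones.
tags = list(range(8, 16)) + [3, 21]
entries = compress_tags(tags, tag_bits=8, n=3)
```

In this example with N = 3, the eight contiguous tags 8 through 15 collapse into a single entry, so ten tags occupy three entries, while every original tag still hits and absent tags still miss.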