What is Cache Memory? Cache memory is a small, high-speed RAM buffer located between the CPU and main memory. Why is cache needed? Cache memory is required to bridge the speed mismatch between the CPU and main memory. The processor clock is very fast, while main memory access time is comparatively slow; without a cache, processing speed would be limited by the speed of main memory.
Cache is volatile: it retains data only while the computer is powered on. The cache stores copies of data from frequently used main memory locations, based on a property called locality of reference. Temporal locality: a recently executed instruction is likely to be executed again very soon. Spatial locality: instructions close to a recently executed instruction are also likely to be executed soon.
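Both forms of locality can be seen in an ordinary loop. The sketch below is a hypothetical illustration (not from the notes): the accumulator is reused on every iteration (temporal locality), and the array elements are accessed in consecutive addresses (spatial locality), so most accesses hit in the cache.

```python
# Hypothetical illustration of locality of reference.

def sum_array(data):
    total = 0                  # 'total' is reused every iteration -> temporal locality
    for i in range(len(data)):
        total += data[i]       # consecutive elements -> spatial locality:
                               # data[i] and data[i+1] usually share a cache block
    return total

print(sum_array([1, 2, 3, 4]))  # -> 10
```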
LEVELS OF CACHE Most processor chips include at least one L1 cache, often as two separate L1 caches: one for instructions and another for data. In high-performance processors, two levels of cache are normally used: separate L1 caches for instructions and data, plus a larger L2 cache. These caches are often implemented on the processor chip. A typical computer may have L1 caches of tens of kilobytes and an L2 cache of hundreds of kilobytes or possibly several megabytes.
CACHE STRUCTURE AND ORGANISATION
CACHE BLOCK A cache block refers to a fixed-size chunk or line of contiguous data loaded from main memory into the cache. Common cache block sizes range from 32 bytes to 128 bytes, with 64 bytes being a popular choice in modern systems. When the processor needs data from memory, an entire cache block containing that addressed data is fetched and loaded into the cache. Subsequent accesses to data within the same cache block can be served directly from the faster cache memory without having to go to slower main memory again.
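The mapping from a byte address to its cache block can be sketched as below; the 64-byte block size is the popular choice mentioned above, and the function name is illustrative. Because the block size is a power of two, the low bits of the address give the byte's position inside its block, and the remaining bits give the block number.

```python
# Sketch: splitting a byte address for a 64-byte cache block (parameters assumed).
BLOCK_SIZE = 64  # bytes; a power of two, so the offset is the low log2(64) = 6 bits

def split_address(addr):
    offset = addr % BLOCK_SIZE         # position of the byte inside its block
    block_number = addr // BLOCK_SIZE  # which main-memory block the byte is in
    return block_number, offset

# Addresses 0x1000..0x103F all fall in the same block, so after the first
# miss the remaining accesses are served from the cache.
print(split_address(0x1004))  # -> (64, 4), since 0x1000 // 64 == 64
```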
Write Through: Data is written to both the cache and main memory simultaneously on every write operation. Ensures that main memory is always consistent with the cache. Slower, since every write goes to slower main memory. Used when data consistency is critical, as in database applications. Write Back: Data is written only to the cache initially on a write operation. The modified cache data is written to main memory only when that cache line is evicted to make room for new data. Faster than write-through, since writes go to the faster cache first. Main memory may be inconsistent with the cache until the modified data is written back. Used when higher write performance is needed, at the cost of delaying writes to main memory.
Write Behind: Similar to write back, data is written to cache first. However, the modified cache lines are copied to a write buffer rather than directly to main memory. These buffered writes are subsequently written to main memory asynchronously at a more convenient time. Allows the overlap of processor writes with memory writes for improved performance. Used in systems that can tolerate some data inconsistency for improved throughput.
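The performance difference between write-through and write-back can be sketched with a toy model; the class and method names below are illustrative, and eviction is triggered manually rather than by a real replacement policy. Five writes to the same line cost five memory writes under write-through, but only one under write-back.

```python
# Minimal sketch contrasting write-through and write-back (names are illustrative).

class WriteThroughCache:
    def __init__(self):
        self.cache, self.mem_writes = {}, 0
    def write(self, addr, value):
        self.cache[addr] = value
        self.mem_writes += 1             # every write also goes to main memory

class WriteBackCache:
    def __init__(self):
        self.cache, self.dirty, self.mem_writes = {}, set(), 0
    def write(self, addr, value):
        self.cache[addr] = value
        self.dirty.add(addr)             # only mark the line as modified
    def evict(self, addr):
        if addr in self.dirty:           # write to memory only on eviction
            self.mem_writes += 1
            self.dirty.discard(addr)
        self.cache.pop(addr, None)

wt, wb = WriteThroughCache(), WriteBackCache()
for v in range(5):
    wt.write(0x10, v)                    # 5 memory writes so far
    wb.write(0x10, v)                    # 0 memory writes so far
wb.evict(0x10)                           # 1 memory write covers all 5 updates
print(wt.mem_writes, wb.mem_writes)      # -> 5 1
```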
The drawback of direct mapping is a high conflict-miss rate: a cache block may have to be replaced even when other lines in the cache are empty. The solution to this is associative mapping. Associative mapping: while transferring data from main memory to the cache, the controller checks for a free line. If a line is free, the block is placed there immediately, without any mapping formula (rule). If no line is free, a replacement algorithm selects one of the lines to be replaced by the new block.
Associative mapping addressing scheme: the address is divided into a Tag field and a Word field. Associative mapping has no conflict misses, but it requires a very large number of tag comparisons, since the tag must be compared against every line in the cache.
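A fully associative lookup with a replacement algorithm can be sketched as follows. This is an assumed toy model (FIFO replacement, four lines, illustrative names): a block's tag is compared against every line, any free line can hold a new block, and the oldest line is replaced when the cache is full.

```python
# Sketch of a fully associative cache with FIFO replacement (assumed parameters).
from collections import deque

NUM_LINES = 4

class AssociativeCache:
    def __init__(self):
        self.lines = deque()             # each entry is a resident block's tag

    def access(self, tag):
        if tag in self.lines:            # tag is compared against EVERY line
            return "hit"
        if len(self.lines) == NUM_LINES:
            self.lines.popleft()         # replacement algorithm: evict oldest
        self.lines.append(tag)           # any free line may hold the new block
        return "miss"

c = AssociativeCache()
# 1,2 miss; 1 hits; 3,4 miss (cache now holds 1,2,3,4);
# 5 evicts 1 and misses; then 1 misses again because it was evicted.
print([c.access(t) for t in [1, 2, 1, 3, 4, 5, 1]])
```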
Set-associative mapping combines direct mapping with fully associative mapping by arranging the lines of a cache into sets. The set is selected using a direct-mapping scheme. However, the lines within each set are treated as a small fully associative cache: any block that maps to a set can be stored in any line within that set.
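The address split for a set-associative cache can be sketched as below; the block size and number of sets are assumed values, not from the notes. The set index is computed by direct mapping, while the tag is compared against every line within the chosen set.

```python
# Sketch: set-associative address mapping (parameters are assumed).
BLOCK_SIZE = 64   # bytes per line -> 6 offset bits
NUM_SETS = 128    # -> 7 set-index bits

def map_address(addr):
    offset = addr % BLOCK_SIZE                   # byte within the block
    set_index = (addr // BLOCK_SIZE) % NUM_SETS  # direct-mapped choice of set
    tag = addr // (BLOCK_SIZE * NUM_SETS)        # compared with every line in the set
    return tag, set_index, offset

print(map_address(0x12345))  # -> (9, 13, 5)
```

The block may be placed in any line of set 13; only that set's tags need comparing, which keeps the comparison cost far below fully associative mapping.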