Memory Technology and Optimization
Presented by: Trupti Diwan, Shweta Ghate, Sapana Vasave

Agenda
- Basics of memory
- Memory technology
- Memory optimization

Memory Technology
- Main memory is RAM (the terms memory, buffer, and cache all refer to RAM), which is nearly 11,000 times faster than secondary memory (a hard disk) for random access.
- Main memory is as vital as the processor chip to a computer system: fast systems have both a fast processor and a large memory.

Characteristics
Some characteristics of computer memory:
- Main memory is closely connected to the processor; it holds the programs and data that the processor is actively working with; the processor interacts with it millions of times per second; its contents are easily changed.
- Secondary memory, by contrast, is used for long-term storage, and its contents are usually organized into files.

Main memory is the short-term memory of a computer: it retains data only while a program is running. Memory is used for the purpose of running programs.

Main Memory and Performance
- Main memory satisfies the demands of caches and serves as the I/O interface.
- Performance measures of main memory emphasize both latency and bandwidth.
- Memory latency is the time delay required to obtain a specific item of data.
- Memory bandwidth is the rate at which data can be accessed (e.g., bytes read or written per unit time); peak bandwidth is normally one access per cycle time.
- This rate can be improved by concurrent access, as the sketch below shows.
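To make the distinction concrete, here is a minimal Python sketch with hypothetical numbers (the access time, access width, and bank count are illustrative assumptions, not figures from the slides) showing how concurrent access raises bandwidth while latency stays fixed:

```python
# A minimal sketch (hypothetical numbers) contrasting memory latency
# and bandwidth. Latency is the delay for one item; bandwidth is bytes
# moved per unit time, which concurrent accesses can raise even though
# the latency of each individual access stays the same.

access_time_ns = 60        # delay until the requested word arrives
bytes_per_access = 8       # width of one memory access
banks = 4                  # concurrent accesses via independent banks

latency_s = access_time_ns * 1e-9

# One access at a time: bandwidth is limited by latency.
serial_bw = bytes_per_access / latency_s
# Overlapping accesses across banks: same latency, higher bandwidth.
concurrent_bw = banks * serial_bw

print(f"latency: {access_time_ns} ns per access")
print(f"serial bandwidth:     {serial_bw / 1e9:.2f} GB/s")
print(f"concurrent bandwidth: {concurrent_bw / 1e9:.2f} GB/s")
```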

Main memory latency affects the cache miss penalty, which is the primary concern of the cache; main memory bandwidth is the primary concern of I/O and multiprocessors.

Although caches benefit from low-latency memory, it is generally easier to improve memory bandwidth with new organizations than it is to reduce latency. Cache designers therefore increase the block size to take advantage of high memory bandwidth.

Memory Latency
Memory latency is traditionally quoted using two measures:
- Access time: the time between when a read is requested and when the desired word arrives.
- Cycle time: the minimum time between requests to memory. Cycle time is greater than access time because the memory needs the address lines to be stable between accesses.
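A small worked example, again with assumed timings, of why the sustained rate of back-to-back reads is set by the cycle time rather than the access time:

```python
# A small worked example (assumed values) showing why cycle time, not
# access time, limits back-to-back requests: the address lines must be
# stable between accesses, so cycle time >= access time.

access_time_ns = 50   # read requested -> word arrives
cycle_time_ns = 70    # minimum spacing between successive requests

requests = 100
# The first word arrives after one access time; each later request can
# start only one cycle time after the previous one.
total_ns = access_time_ns + (requests - 1) * cycle_time_ns
print(f"{requests} reads take {total_ns} ns "
      f"({total_ns / requests:.1f} ns per read, close to the cycle time)")
```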

Memory Hierarchy of a Modern Computer System
By taking advantage of the principle of locality, the hierarchy can:
- present the user with as much memory as is available in the cheapest technology, and
- provide access at the speed offered by the fastest technology.
[Figure: the memory hierarchy, from processor registers and on-chip cache through second-level cache (SRAM) and main memory (DRAM) to secondary storage (disk) and tertiary storage (tape). Speeds (ns) run from 1s and 10s at the register/cache end through 100s for SRAM and DRAM to 10,000,000s (10s of ms) for disk and 10,000,000,000s (10s of sec) for tape; sizes (bytes) run from 100s through Ks, Ms, and Gs up to Ts.]

Memory Technology
- Static Random Access Memory (SRAM): used for caches.
- Dynamic Random Access Memory (DRAM): used for main memory.

SRAM
- 'S' stands for static: no refresh is needed, so access time is close to cycle time.
- Uses six transistors per bit; bits are stored as on/off switches, so there are no charges to leak.
- More complex construction, larger per bit, more expensive, faster.

Static RAM structure (figure)

Static RAM Operation
- The transistor arrangement gives a stable logic state.
- State 1: C1 high, C2 low; T1 and T4 off, T2 and T3 on.
- State 0: C2 high, C1 low; T2 and T3 off, T1 and T4 on.
- The address line transistors T5 and T6 act as switches.
- Write: apply the value to line B and its complement to line B̄.
- Read: the value is on line B.
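As a quick sanity check of these two states, a toy sketch (plain Python, not a circuit simulation) treats the cell as a pair of cross-coupled inverters and tests which node assignments are self-consistent:

```python
# A toy consistency check of the two stable states described above:
# in a cross-coupled inverter pair, each node must equal the inverse
# of the other node for the state to be stable.

def stable(c1: int, c2: int) -> bool:
    inv = lambda x: 1 - x
    return c1 == inv(c2) and c2 == inv(c1)

for c1, c2 in [(1, 0), (0, 1), (1, 1), (0, 0)]:
    print(f"C1={c1}, C2={c2}: {'stable' if stable(c1, c2) else 'unstable'}")
```

Only (1, 0) and (0, 1) come out stable, matching state 1 and state 0 on the slide.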

DRAM
- Bits are stored as charge in capacitors; charges leak, so the cells need refreshing even when powered.
- Simpler construction, smaller per bit, less expensive.
- Needs refresh circuits; slower: the cycle time is longer than the access time.
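To see why refresh matters, here is a rough estimate with assumed parameters (the row count, retention window, and per-row refresh time are illustrative, not from the slides) of how much time a DRAM bank spends refreshing:

```python
# A rough estimate (assumed parameters) of DRAM refresh overhead: every
# row must be refreshed within the retention window, and the bank is
# unavailable while a row is being refreshed.

rows = 8192                 # rows per bank, illustrative
refresh_window_ms = 64      # all rows refreshed within this window
row_refresh_ns = 100        # time one row refresh occupies the bank

busy_ns = rows * row_refresh_ns
window_ns = refresh_window_ms * 1e6
print(f"refresh overhead: {100 * busy_ns / window_ns:.2f}% of the time")
```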

DRAM structure (figure)

DRAM Operation
- The address line is active when a bit is read or written; the transistor switch closes (current flows).
- Write: apply a voltage to the bit line (high for 1, low for 0), then signal the address line, transferring charge to the capacitor.
- Read: selecting the address line turns the transistor on; charge from the capacitor feeds via the bit line to a sense amplifier, which compares it with a reference value to determine 0 or 1. The capacitor charge must then be restored.

DRAM Addressing
Addresses are divided into two halves (memory is treated as a 2D matrix):
- RAS, the Row Access Strobe
- CAS, the Column Access Strobe
[Figure: internal organization of a DRAM]
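A minimal sketch of this two-half addressing, with hypothetical row/column widths (real chips vary), splitting a flat address into the row sent with RAS and the column sent with CAS:

```python
# Hypothetical sketch of how a DRAM address is split into two halves:
# the high bits select a row (sent with RAS), the low bits select a
# column (sent with CAS). Bit widths are illustrative, not taken from
# any specific chip.

ROW_BITS = 12
COL_BITS = 10

def split_address(addr: int) -> tuple[int, int]:
    """Return (row, column) for a flat DRAM address."""
    col = addr & ((1 << COL_BITS) - 1)   # low half, strobed with CAS
    row = addr >> COL_BITS               # high half, strobed with RAS
    assert row < (1 << ROW_BITS), "address out of range"
    return row, col

row, col = split_address(0x3A7C1)
print(f"row {row:#x} (RAS), column {col:#x} (CAS)")
```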

Improving memory performance inside a DRAM chip
Early DRAM had an asynchronous interface to the memory controller, so every transfer involved overhead to synchronize with the controller. Adding a clock signal to the DRAM interface reduces that overhead; DRAMs with this optimization are called synchronous DRAMs (SDRAMs).

Double Data Rate (DDR) SDRAM was a later development, used in PC memory beginning in 2000. DDR SDRAM internally performs double-width accesses at the clock rate and uses a double-data-rate interface to transfer one half on each clock edge. Later versions transfer still more data per clock cycle:
- DDR: 2 data transfers/cycle
- DDR2: 4 data transfers/cycle
- DDR3: 8 data transfers/cycle
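A back-of-the-envelope calculation of peak transfer rates, using the transfers-per-cycle counts from this slide together with an assumed 64-bit bus and an illustrative 200 MHz base clock:

```python
# Peak transfer rates per generation, using the transfers-per-cycle
# counts from the slide. The 64-bit bus width and 200 MHz base clock
# are assumptions for illustration only.

bus_bytes = 8           # 64-bit memory bus (assumed)
clock_mhz = 200         # illustrative base memory clock

transfers = {"SDRAM": 1, "DDR": 2, "DDR2": 4, "DDR3": 8}

for gen, per_cycle in transfers.items():
    gbs = clock_mhz * 1e6 * per_cycle * bus_bytes / 1e9
    print(f"{gen:5s}: {per_cycle} transfers/cycle -> {gbs:.1f} GB/s peak")
```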

Memory Optimization
Review: 6 Basic Cache Optimizations
Reducing hit time:
1. Address translation during cache indexing
Reducing miss penalty:
2. Multilevel caches
3. Giving priority to read misses over write misses
Reducing miss rate:
4. Larger block size (compulsory misses)
5. Larger cache size (capacity misses)
6. Higher associativity (conflict misses)

11 Advanced Cache Optimizations
Reducing hit time:
1. Small and simple caches
2. Way prediction
3. Trace caches
Increasing cache bandwidth:
4. Pipelined caches
5. Multibanked caches
6. Nonblocking caches
Reducing miss penalty:
7. Critical word first
8. Merging write buffers
Reducing miss rate:
9. Compiler optimizations
Reducing miss penalty or miss rate via parallelism:
10. Hardware prefetching
11. Compiler prefetching

The advanced cache optimizations fall into the following categories:
- Reducing the hit time: small and simple caches, way prediction, and trace caches
- Increasing cache bandwidth: pipelined caches, multibanked caches, and nonblocking caches
- Reducing the miss penalty: critical word first and merging write buffers
- Reducing the miss rate: compiler optimizations
- Reducing the miss penalty or miss rate via parallelism: hardware prefetching and compiler prefetching
We will conclude with a summary of the implementation complexity and the performance impact of each.

First Optimization: Small and Simple Caches to Reduce Hit Time
A time-consuming portion of a cache hit is using the index portion of the address to read the tag memory and then compare it to the address. Smaller hardware can be faster, so a small cache can help the hit time; it is also critical to keep an L2 cache small enough to fit on the same chip as the processor to avoid the time penalty of going off chip. The second suggestion is to keep the cache simple, for example by using direct mapping: one benefit of direct-mapped caches is that the designer can overlap the tag check with the transmission of the data, which effectively reduces hit time. The sketch below illustrates both points.
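A minimal sketch of a direct-mapped lookup (the block and index sizes are assumptions for illustration), showing how the single candidate line lets the data read proceed in parallel with the tag comparison:

```python
# A minimal direct-mapped cache lookup sketch (assumed geometry). The
# index part of the address selects exactly one tag entry, which is
# compared with the tag part of the address. Because there is only one
# candidate line, the data read can be overlapped with the tag check.

OFFSET_BITS = 6          # 64-byte blocks (assumed)
INDEX_BITS = 8           # 256 lines: a small, simple cache (assumed)

tags = [None] * (1 << INDEX_BITS)    # tag memory
data = [None] * (1 << INDEX_BITS)    # data memory

def lookup(addr: int):
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    block = data[index]              # speculative data read, overlapped...
    hit = tags[index] == tag         # ...with the tag comparison
    return (block if hit else None), hit

tags[0x2A] = 0x1F
data[0x2A] = b"cached block"
print(lookup((0x1F << 14) | (0x2A << 6)))   # hit
print(lookup(0x0))                           # miss
```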

An excerpt from Figure 5.11, listing each technique with the factors it improves (+), its hardware cost/complexity (0 = easiest, 3 = hardest), and a comment:
- Compiler techniques to reduce cache misses: miss rate +; complexity 0; software is a challenge, though some computers have a compiler option.
- Hardware prefetching of instructions and data: miss penalty +, miss rate +; complexity 2 (instructions), 3 (data); many prefetch instructions; the Opteron and Pentium 4 prefetch data.
- Compiler-controlled prefetching: miss penalty +, miss rate +; complexity 3; needs a nonblocking cache; possible instruction overhead; in many CPUs.

Figure 5.11: Summary of the 11 advanced cache optimizations, showing their impact on cache performance and complexity. Although generally a technique helps only one factor, prefetching can reduce misses if done sufficiently early; if not, it can reduce the miss penalty. '+' means the technique improves the factor, '–' means it hurts that factor, and blank means it has no impact. The complexity measure is subjective, with 0 being the easiest and 3 being a challenge.

Cache Optimization Summary
The techniques to improve hit time, bandwidth, miss penalty, and miss rate generally affect the other components of the average memory access time equation (AMAT = hit time + miss rate × miss penalty) as well as the complexity of the memory hierarchy. Figure 5.11 estimates the impact on complexity, with + meaning that the technique improves the factor, – meaning it hurts that factor, and blank meaning it has no impact. Generally, no technique helps more than one category.
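For reference, a quick numeric illustration of that equation with assumed values, showing how a change that helps one factor (miss rate) can hurt another (hit time) yet still lower the average:

```python
# A quick numeric illustration (assumed values) of the average memory
# access time equation the summary refers to:
#   AMAT = hit time + miss rate * miss penalty

def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    return hit_time_ns + miss_rate * miss_penalty_ns

base = amat(1.0, 0.05, 100)       # small, simple cache
bigger = amat(1.4, 0.03, 100)     # larger cache: fewer misses,
                                  # but a slower hit time
print(f"base:   {base:.2f} ns")   # 6.00 ns
print(f"bigger: {bigger:.2f} ns") # 4.40 ns
```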

References
Hennessy, J. L. and Patterson, D. A., Computer Architecture: A Quantitative Approach.

Thank You