The document summarizes key aspects of the memory hierarchy: (1) memory hierarchies use multiple levels of storage, with smaller, faster levels closer to the CPU and larger, slower levels further away; this exploits locality and improves performance over a single level. (2) The key metrics are hit rate, miss rate, miss penalty, and average memory access time; miss rate and miss penalty together determine the average access time. (3) Designing a memory hierarchy involves choices for block placement, block identification, block replacement, and write handling: placement can be fully associative, direct mapped, or set associative; identification uses tags; replacement policies include random and LRU; writes can be handled with write-through or write-back policies.

Lecture 12

Storage Hierarchy

CS510 Computer Architecture


Who Cares about Memory Hierarchy?
• Processor Only Thus Far in Course
– CPU cost/performance, ISA, Pipelined Execution
[Figure: CPU vs. DRAM performance, 1980-2000, log scale. CPU performance improves roughly 35%/year until the mid-1980s and 55%/year thereafter, while DRAM performance improves only about 7%/year, so the CPU-DRAM gap widens steadily.]
• 1980: no cache in microprocessors
• 1995: 2-level caches; 60% of the transistors on the Alpha 21164 microprocessor are devoted to cache
General Principles
• Locality
– Temporal locality: recently referenced items tend to be referenced again soon
– Spatial locality: items near a recently referenced item tend to be referenced soon
• Locality + "smaller hardware is faster" => memory hierarchy
– Levels: each level is smaller, faster, and more expensive per byte than the level below
– Inclusive: data found in an upper level is also found in the level below
• Definitions
– Upper level: the level closer to the processor
– Block: the minimum unit of data that is either present or not present in the upper level
– Address = block frame address + block offset
– Hit time: time to access the upper level, including the time to determine hit or miss
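To make the address split concrete, a minimal C sketch follows; the 32-byte block size and the example address are assumptions for illustration, not values from the lecture.

#include <stdio.h>
#include <stdint.h>

/* Assumed for illustration: 32-byte blocks, so 5 offset bits. */
#define BLOCK_SIZE  32u
#define OFFSET_BITS 5u

int main(void) {
    uint32_t addr        = 0x1234u;                   /* example byte address */
    uint32_t block_frame = addr >> OFFSET_BITS;       /* block frame address  */
    uint32_t offset      = addr & (BLOCK_SIZE - 1u);  /* block offset address */
    printf("addr=0x%x  block frame=0x%x  offset=%u\n", addr, block_frame, offset);
    return 0;
}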



Measures
• Hit rate: fraction of accesses found in that level
– So high that we usually talk about the miss rate (or fault rate) instead
– Miss-rate fallacy: miss rate can mislead about memory performance just as MIPS can mislead about CPU performance; average memory access time is the measure that matters
• Average memory access time = Hit time + Miss rate x Miss penalty (in ns or clock cycles)
• Miss penalty: time to replace a block from the lower level, including the time to deliver it to the CPU
– Access time: time to access the lower level = f(lower-level latency)
– Transfer time: time to transfer the block = f(bandwidth between upper and lower levels, block size)
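For concreteness, a minimal C sketch that evaluates the AMAT formula above; the hit time, miss rate, and miss penalty values are illustrative assumptions, not figures from the lecture.

#include <stdio.h>

/* Average memory access time (AMAT) = hit time + miss rate * miss penalty.
   The numbers below are assumed for illustration only. */
int main(void) {
    double hit_time     = 1.0;   /* clock cycles to access the upper level */
    double miss_rate    = 0.05;  /* 5% of accesses miss in the upper level */
    double miss_penalty = 40.0;  /* cycles to fetch the block from below   */

    double amat = hit_time + miss_rate * miss_penalty;
    printf("AMAT = %.2f cycles\n", amat);   /* 1 + 0.05 * 40 = 3.00 cycles */
    return 0;
}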



Block Size vs. Measures
Increasing the block size generally increases the miss penalty and decreases the miss rate.

[Figure: three plots against block size: the miss rate falls, the miss penalty (access time + transfer time) rises, and the average memory access time reaches a minimum at an intermediate block size.]

Hit Time + Miss Rate x Miss Penalty = Average Memory Access Time (AMAT)
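A minimal C sketch of this trade-off, assuming a toy model in which the miss penalty is a fixed access latency plus a bandwidth-limited transfer time and the miss rate falls as blocks get larger; all numbers are invented for illustration. Under these assumptions AMAT bottoms out at an intermediate block size (128 B here).

#include <stdio.h>

/* Toy model of the block-size trade-off (all parameters assumed):
   80 ns access latency, 1 byte/ns transfer bandwidth, 1 ns hit time,
   and a miss rate that shrinks with block size but flattens out. */
int main(void) {
    double hit_time = 1.0, access_time = 80.0, bytes_per_ns = 1.0;
    int sizes[] = {16, 32, 64, 128, 256};

    for (int i = 0; i < 5; i++) {
        double block        = sizes[i];
        double miss_penalty = access_time + block / bytes_per_ns;  /* grows with block size  */
        double miss_rate    = 0.10 * (16.0 / block) + 0.01;        /* shrinks, then flattens */
        double amat         = hit_time + miss_rate * miss_penalty;
        printf("block=%4d B  miss rate=%.3f  miss penalty=%5.1f ns  AMAT=%5.2f ns\n",
               sizes[i], miss_rate, miss_penalty, amat);
    }
    return 0;   /* AMAT is smallest at 128 B under this model */
}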



Implications for CPU
• Fast hit check since every memory access needs it
– Hit is the common case
• Unpredictable memory access time
– 10s of clock cycles: stall and wait
– 1,000s of clock cycles:
• Interrupt, switch, and do something else
• New style: multithreaded execution
• How to handle a miss? (10s of cycles => HW, 1,000s of cycles => SW)



4 Questions for Memory Hierarchy Designers
• Q1: Where can a block be placed in the upper level? (Block placement)
• Q2: How is a block found if it is in the upper level? (Block identification)
• Q3: Which block should be replaced on a miss? (Block replacement)
• Q4: What happens on a write? (Write strategy)



Q1: Block Placement: Where can a Block be Placed in the Upper Level?
Example: placing memory block 12 in an 8-block cache
– Fully Associative (FA): block 12 can go anywhere
– Direct Mapped (DM): block 12 can go only into cache block 4, since 12 mod 8 = 4
– 2-way Set Associative (SA): block 12 can go anywhere in set 0, since 12 mod 4 = 0
– SA mapping: set = (block number) mod (number of sets)
[Figure: the 8-block cache drawn for each policy (cache blocks 0-7; for 2-way SA, sets 0-3), above the memory's block frame addresses 0, 1, 2, ... with block 12 highlighted.]
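A minimal C sketch of the three placement computations for this example; the frame-numbering convention for the 2-way cache (set s occupies frames 2s and 2s+1) is an assumption made for illustration.

#include <stdio.h>

/* Where can memory block 12 go in an 8-block cache? One computation per policy. */
int main(void) {
    int block = 12, cache_blocks = 8;

    /* Direct mapped: exactly one candidate frame. */
    printf("direct mapped   : cache block %d\n", block % cache_blocks);

    /* 2-way set associative: any frame within one set. */
    int ways = 2, sets = cache_blocks / ways;
    int set = block % sets;
    printf("2-way set assoc.: set %d (frames %d and %d)\n",
           set, set * ways, set * ways + 1);

    /* Fully associative: any frame at all. */
    printf("fully assoc.    : any of frames 0..%d\n", cache_blocks - 1);
    return 0;
}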



Q2: Block Identification: How to Find a Block in the Upper Level?
• Tag on each block
– No need to check the index or the block offset
• Increasing associativity shrinks the index and expands the tag
• Address fields: | Tag | Index | Block Offset |  (tag + index = block address)
– FAM (fully associative): no index field
– DM (direct mapped): large index field
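As an illustration of the field split, a minimal C sketch follows; the cache geometry (32-byte blocks, 256 sets) and the example address are assumptions, not parameters from the lecture.

#include <stdio.h>
#include <stdint.h>

/* Split an address into tag, index, and block offset.
   Assumed geometry: 32-byte blocks (5 offset bits), 256 sets (8 index bits). */
#define OFFSET_BITS 5u
#define INDEX_BITS  8u

int main(void) {
    uint32_t addr   = 0xDEADBEEFu;                              /* example address */
    uint32_t offset = addr & ((1u << OFFSET_BITS) - 1u);
    uint32_t index  = (addr >> OFFSET_BITS) & ((1u << INDEX_BITS) - 1u);
    uint32_t tag    = addr >> (OFFSET_BITS + INDEX_BITS);

    /* A hit means a valid line in the indexed set stores this same tag. */
    printf("tag=0x%x  index=%u  offset=%u\n", tag, index, offset);
    return 0;
}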



Q3: Block Replacement: Which Block Should be Replaced on a Miss?
• Easy for direct mapped: there is only one candidate block
• Set associative (SAM) or fully associative (FAM):
– Random
– LRU (Least Recently Used)

Miss rates, LRU vs. Random replacement:

Cache Size   2-way LRU  2-way Random   4-way LRU  4-way Random   8-way LRU  8-way Random
16 KB          5.18%       5.69%          4.67%       5.29%          4.39%       4.96%
64 KB          1.88%       2.01%          1.54%       1.66%          1.39%       1.53%
256 KB         1.15%       1.17%          1.13%       1.13%          1.12%       1.12%
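A minimal C sketch of LRU replacement within a single set of a 4-way set-associative cache, using last-use timestamps; the tags and the access sequence are invented for illustration.

#include <stdio.h>

#define WAYS 4

typedef struct { int valid; unsigned tag; unsigned last_used; } Line;

static unsigned now = 0;

/* Access a tag in one set; on a miss, replace an empty way or else the LRU way. */
static int access_set(Line set[WAYS], unsigned tag) {
    int victim = 0;
    for (int w = 0; w < WAYS; w++) {
        if (set[w].valid && set[w].tag == tag) {     /* hit: refresh timestamp */
            set[w].last_used = ++now;
            return w;
        }
        if (!set[w].valid) victim = w;               /* prefer an empty way    */
        else if (set[victim].valid &&
                 set[w].last_used < set[victim].last_used)
            victim = w;                              /* otherwise track the LRU way */
    }
    set[victim] = (Line){1, tag, ++now};             /* miss: install the block */
    return victim;
}

int main(void) {
    Line set[WAYS] = {0};
    unsigned seq[] = {1, 2, 3, 4, 1, 5};             /* tag 2 is LRU when 5 arrives */
    for (int i = 0; i < 6; i++)
        printf("tag %u -> way %d\n", seq[i], access_set(set, seq[i]));
    return 0;
}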



Q4: Write Strategy: What Happens on a Write?
• DLX integer programs: stores 9%, loads 26% of instructions
– STOREs are therefore:
• 9% / (100% + 26% + 9%) ≈ 7% of the overall memory traffic
• 9% / (26% + 9%) ≈ 25% of the data cache traffic
– READ accesses are the majority, so making the common case fast means optimizing caches for reads
– But high-performance designs cannot neglect the speed of WRITEs



Q4: What Happens on a Write?
• Write Through: The information is written to both the block in
the cache and to the block in the lower-level memory.
• Write Back: The information is written only to the block in the
cache. The modified cache block(Dirty Block) is written to
main memory only when it is replaced.
– Requires knowing whether each block is clean or dirty
• Pros and cons of each:
– WT: read misses never cause writes back to memory when a block is replaced
– WB: repeated writes to the same block require only one write to the lower-level memory
• WT needs to be combined with a write buffer so that the CPU does not wait for the lower-level memory
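A minimal C sketch contrasting the two policies for a single cached block; the tiny memory, block size, and access sequence are assumptions for illustration only.

#include <stdio.h>
#include <string.h>

/* One cache block over a tiny "memory", to contrast the two write policies. */
enum { BLOCK = 4 };
static unsigned char memory[BLOCK];                  /* lower-level memory       */
static unsigned char cache[BLOCK];                   /* the single cached block  */
static int dirty = 0;                                /* used by write-back only  */

static void write_through(int i, unsigned char v) {  /* update cache AND memory  */
    cache[i] = v;
    memory[i] = v;                                   /* in practice via a write buffer */
}

static void write_back(int i, unsigned char v) {     /* update cache only, mark dirty  */
    cache[i] = v;
    dirty = 1;
}

static void evict_write_back(void) {                 /* dirty block written on replacement */
    if (dirty) { memcpy(memory, cache, BLOCK); dirty = 0; }
}

int main(void) {
    write_back(0, 7); write_back(0, 9);   /* repeated writes stay in the cache */
    evict_write_back();                   /* one write to memory at eviction   */
    write_through(1, 3);                  /* every store also updates memory   */
    printf("memory = %d %d %d %d\n", memory[0], memory[1], memory[2], memory[3]);
    return 0;
}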



Q4: What Happens on a Write?
• Write Miss
– Write Allocate (fetch on write)
– No-Write Allocate (write around)
• WB caches generally use Write Allocate, while WT caches
often use No-Write Allocate



Summary
• The CPU-memory gap is a major performance obstacle, for both HW and SW
• Take advantage of program behavior: locality
• Program execution time is still the only reliable performance measure
• 4Qs of memory hierarchy
– Block Placement
– Block Identification
– Block Replacement
– Write Strategy

