Processor Design Flow for Architecture Design

Varsha506533 13 views 17 slides Aug 26, 2024

About This Presentation

Lecture Notes


Slide Content

Module II:
Memory: Organization, memory segmentation, multithreading, symmetric multiprocessing.
Processor design flow: Capturing requirements, instruction coding, exploration of architecture organizations, hardware and software development. Extreme CISC and extreme RISC, very long instruction word (VLIW).

Module II: B: Processor
Design flow

Extreme CISC and Extreme RISC

Towards CISC
•Wired logic → microcode control
▫Temptingly easy extensibility
•Performance tuning
▫HW implementation of some high-level functions
•Marketing
▫Add successful instructions of competitors
▫“New feature” hype
▫Compatibility: only extensions are possible

CISC Problems
•Performance tuning unsuccessful
▫Rarely used high-level instructions
▫Sometimes slower than an equivalent sequence of simple instructions
•High complexity
▫Pipelining bottlenecks → lower clock rates
▫Interrupt handling can complicate things even more
•Marketing
▫Prolonged design time and frequent microcode errors hurt competitiveness

RISC Features
•Low complexity
▫Generally results in overall speedup
▫Less error-prone implementation by hardwired logic or simple microcode
•VLSI implementation advantages
▫Fewer transistors
▫Extra space: more registers, larger caches
•Marketing
▫Reduced design time, fewer errors, and more options increase competitiveness

RISC Compiler Issues
•The compilers themselves
▫Computationally more complex
▫More portable
•The compiler writer
▫Fewer instructions → probably an easier job
▫Simpler instructions → probably fewer bugs
▫Can reuse optimization techniques

RISC vs. CISC misconceptions
•Arguments favoring RISC: simple design, short design time, speed, price…

•A study of RISC should include hardware/software tradeoffs, factors influencing computer performance, and industry-side evaluation.

RISC vs. CISC misconceptions
•Incorrect implication from the two acronyms, RISC and CISC:
▫They are not a bifurcation between which designers have to choose
•Carelessly leaving out the ‘participation’ of the operating system

RISC vs. CISC misconceptions
•Reduced design time?
▫academic vs. industrial settings differ
•Performance claims of RISC proponents often do not decouple design features such as multiple register sets (MRSs)
▫MRSs can have a remarkable effect on program execution

Conclusion – RISC vs. CISC?
•CISC
▫Effectively realizes one particular High-Level Language Computer System in HW: recurring HW development costs when changes are needed
•RISC
▫Allows effective realization of any High-Level Language Computer System in SW: recurring SW development costs when changes are needed

Conclusion – Optimum?
•Hybrid solutions
▫RISC core & CISC interface
▫Still requires target-specific performance tuning
•Optimal ISA
▫Between RISC & CISC
▫Few, carefully chosen, useful complex instructions
▫Still has complexity-handling problems

VLIW Processors
VLIW (“very long instruction word”) processors
•instructions are scheduled by the compiler
•a fixed number of operations is formatted as one big instruction (called a bundle)
▫usually LIW (3 operations)
▫a change in the instruction set architecture: 1 program counter points to 1 bundle (not 1 operation)
•operations in a bundle issue in parallel
▫fixed format, so operations can be decoded in parallel
▫enough FUs for the types of operations that can issue in parallel
▫pipelined FUs

VLIW Processors
Goals of the hardware design:
•reduce hardware complexity
•shorten the cycle time for better performance
•reduce power requirements
How VLIW designs reduce hardware complexity:
•less multiple-issue hardware
▫no dependence checking for operations within a bundle
▫fewer paths needed between instruction issue slots & FUs
•simpler instruction dispatch
▫no out-of-order execution, no instruction grouping
•ideally no structural hazard checking logic
•the reduction in hardware complexity improves cycle time & power consumption

VLIW Processors
Compiler support to increase ILP
•compiler creates each VLIW word
•the need for good code scheduling is greater than with in-order-issue superscalars
▫a bundle doesn’t issue if even 1 of its operations can’t
•techniques for increasing ILP
▫loop unrolling
▫software pipelining (schedules instructions from different iterations together)
▫aggressive inlining (function body becomes part of the caller’s code)
▫trace scheduling (schedules beyond basic-block boundaries)

VLIW Processors
More compiler support to increase ILP
•detects hazards & hides latencies
structural hazards
▫no 2 operations to the same functional unit
▫no 2 operations to the same memory bank
hiding latencies
▫data prefetching
▫hoisting loads above stores
data hazards
▫no data hazards among operations in a bundle
control hazards
▫predicated execution
▫static branch prediction

Superscalars vs. VLIW
A superscalar has more complex hardware for instruction scheduling
•instruction slotting or out-of-order hardware
•more paths between the instruction issue structure & functional units
•possible consequences:
▫slower cycle times
▫more chip real estate
▫more power consumption
but a VLIW has more functional units if it supports full predication
•possible consequences:
▫slower cycle times
▫more chip real estate
▫more power consumption