VLIW

ajal4u 7,216 views 24 slides Aug 11, 2014

About This Presentation

checking dependencies between instructions to determine which instructions can be grouped together for parallel execution;
assigning instructions to the functional units on the hardware;
determining when instructions are initiated (that is, when they are placed together into a single word).


Slide Content

VLSI DESIGN GROUP – METS SCHOOL OF ENGINEERING, MALA
Superscalar and VLIW Architectures
VLSI ARCHITECTURES

Outline
• Types of architectures
• Superscalar
• Differences between CISC, RISC and VLIW
• VLIW (very long instruction word)

Parallel processing
Processing instructions in parallel requires three major tasks:
1. checking dependencies between instructions to determine which instructions can be grouped together for parallel execution;
2. assigning instructions to the functional units on the hardware;
3. determining when instructions are initiated (that is, when they are placed together into a single word).
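
The three tasks above can be sketched as a small greedy packer. This is only an illustration under simplifying assumptions that are not in the slides: every operation has unit latency, register names are plain strings, and the `Op` type, the `depends` check and the unit kinds `"mem"` and `"alu"` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    unit: str          # kind of functional unit needed (hypothetical: "alu", "mem")
    dst: str           # register written
    srcs: tuple = ()   # registers read


def depends(later, earlier):
    """Task 1: True if `later` may not share a word with `earlier`.

    Conservative: read-after-write, write-after-read and
    write-after-write on the same register all count.
    """
    raw = earlier.dst in later.srcs
    war = later.dst in earlier.srcs
    waw = later.dst == earlier.dst
    return raw or war or waw


def schedule(ops, units):
    """Pack `ops` (in program order) into instruction words.

    `units` maps a unit kind to how many such units one word can drive.
    Assumes unit latency, so a dependent op goes at least one word later.
    """
    words = []    # words[i] is the list of ops issued together in word i
    placed = []   # (op, word index) for every op scheduled so far
    for op in ops:
        # Task 3: earliest word allowed by the dependencies.
        earliest = 0
        for prev, w in placed:
            if depends(op, prev):
                earliest = max(earliest, w + 1)
        # Task 2: first word from there on with a free unit of the right kind.
        w = earliest
        while True:
            if w == len(words):
                words.append([])
            used = sum(1 for o in words[w] if o.unit == op.unit)
            if used < units[op.unit]:
                words[w].append(op)
                placed.append((op, w))
                break
            w += 1
    return words
```

With one memory unit and two ALUs, two loads followed by an add that consumes both and an unrelated subtract fit into three words; the subtract fills a free ALU slot next to the first load.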

Major categories
VLIW – Very Long Instruction Word
EPIC – Explicitly Parallel Instruction Computing

Superscalar Processors
• Superscalar processors are designed to exploit more instruction-level parallelism in user programs.
• Only independent instructions can be executed in parallel without causing a wait state.
• The amount of instruction-level parallelism varies widely depending on the type of code being executed.
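
One common way to put a number on "amount of instruction-level parallelism" is the instruction count divided by the length of the longest dependence chain. The sketch below assumes unit latency and unlimited functional units, and the function name `available_ilp` and its input format are made up for this example.

```python
def available_ilp(deps):
    """Estimate ILP as instruction count / critical-path length.

    `deps` maps each instruction index to the earlier indices it depends on
    (program order, so dependencies always point backwards).
    Assumes unit latency and no resource limits.
    """
    depth = {}
    for i in sorted(deps):
        # An instruction sits one level below its deepest producer.
        depth[i] = 1 + max((depth[j] for j in deps[i]), default=0)
    critical_path = max(depth.values(), default=0)
    return len(deps) / critical_path if critical_path else 0.0


# Four independent instructions versus a four-instruction serial chain.
independent = {0: [], 1: [], 2: [], 3: []}
chain = {0: [], 1: [0], 2: [1], 3: [2]}
```

Here `available_ilp(independent)` is 4.0 and `available_ilp(chain)` is 1.0, which is the kind of spread between code types the slide refers to.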

Pipelining in Superscalar Processors
• In order to fully utilise a superscalar processor of degree m, m instructions must be executable in parallel. This situation may not be true in all clock cycles. In that case, some of the pipelines may be stalling in a wait state.
• In a superscalar processor, the simple operation latency should require only one cycle, as in the base scalar processor.
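
A rough model shows how dependencies leave issue slots idle in a processor of degree m. It assumes in-order issue, unit latency, and that an instruction cannot issue in the same cycle as one it depends on; `issue_cycles` and `utilisation` are illustrative names, not taken from the slides.

```python
def issue_cycles(deps, m):
    """Count cycles needed to issue all instructions, at most m per cycle.

    `deps[i]` is the set of earlier instruction indices that instruction i
    depends on. Issue is in order: a cycle's group ends at the first
    instruction that depends on something already in the group.
    """
    cycles = 0
    i = 0
    n = len(deps)
    while i < n:
        group = set()
        while i < n and len(group) < m and not (deps[i] & group):
            group.add(i)
            i += 1
        cycles += 1
    return cycles


def utilisation(deps, m):
    """Fraction of the m issue slots per cycle that did useful work."""
    if not deps:
        return 0.0
    return len(deps) / (m * issue_cycles(deps, m))


# Six instructions with a few dependencies between them.
program = [set(), set(), {0, 1}, set(), {3}, {2, 4}]
```

For this program, degree 2 and degree 3 both need four cycles, so utilisation drops from 0.75 to 0.5: the extra pipeline mostly waits.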

Superscalar Execution

Superscalar Implementation
• Simultaneously fetch multiple instructions
• Logic to determine true dependencies involving register values
• Mechanisms to communicate these values
• Mechanisms to initiate multiple instructions in parallel
• Resources for parallel execution of multiple instructions
• Mechanisms for committing process state in correct order

VLIW History
• The term was coined by J. A. Fisher (Yale) in 1983
– ELI-512 (prototype)
– Trace (commercial)
• Its origin lies in horizontal microcode optimization
• Another pioneering work was by B. Ramakrishna Rau in 1982
– Polycyclic (prototype)
– Cydra-5 (commercial)
• Recent developments
– TriMedia (Philips)
– TMS320C6x (Texas Instruments)

The VLIW Architecture
• A typical VLIW (very long instruction word) machine has instruction words hundreds of bits in length.
• Multiple functional units are used concurrently in a VLIW processor.
• All functional units share the use of a common large register file.

Why are superscalar processors commercially more popular than VLIW processors?
• Binary code compatibility among scalar and superscalar processors of the same family
• The same compiler works for all processors (scalar and superscalar) of the same family
• Assembly programming of VLIWs is tedious
• Code density in VLIWs is very poor
– Instruction encoding schemes
• Area and performance

Data path: a simple VLIW architecture
[Figure: three functional units (FU) connected to a single shared register file]
Scalability?
Access time, area and power consumption increase sharply with the number of register ports.

Data path: clustered VLIW architecture (distributed register file)
[Figure: three clusters, each with two functional units (FU) and its own register file, connected by an interconnection network]
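
A back-of-the-envelope comparison suggests why splitting the register file helps. Everything numeric here is an assumption made for illustration, not a figure from the slides: two read ports and one write port per functional unit, one extra port per cluster for the interconnection network, and register-file area growing with the square of the port count (a rule of thumb only).

```python
def ports(fus, reads_per_fu=2, writes_per_fu=1):
    """Register-file ports needed to feed `fus` functional units.

    ASSUMPTION: each FU reads two operands and writes one result per cycle.
    """
    return fus * (reads_per_fu + writes_per_fu)


def relative_area(num_ports):
    """ASSUMPTION: area grows with the square of the port count."""
    return num_ports ** 2


def compare(total_fus, clusters, link_ports=1):
    """Return (monolithic, clustered) relative register-file area.

    `total_fus` must divide evenly into `clusters`. `link_ports` is the
    hypothetical number of extra ports each cluster spends on the network.
    """
    if total_fus % clusters:
        raise ValueError("total_fus must be a multiple of clusters")
    monolithic = relative_area(ports(total_fus))
    per_cluster = ports(total_fus // clusters) + link_ports
    clustered = clusters * relative_area(per_cluster)
    return monolithic, clustered
```

For six functional units, `compare(6, 3)` gives 324 units of area for one shared file against 147 for three clustered files under these assumptions. The model ignores the cost of moving values between clusters, which the compiler has to schedule.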

Coarse-grain FUs with VLIW core
[Figure: embedded (co)-processors as FUs in a VLIW architecture. Labels: MULT, RAM, ALU and a coarse-grain FU, with Reg1/Reg2 input registers; multiplexer network; microcode; IR; program counter logic]

Application Specific FUs
Properties of functional units:
• FU functionality
• number of inputs
• number of outputs
• latency
• initiation interval
• I/O time shape
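
Latency and initiation interval together determine how fast a functional unit gets through a stream of operations. The helper below is a minimal sketch; the function name and the example numbers are illustrative only.

```python
def completion_cycle(n_ops, latency, initiation_interval):
    """Cycle at which the last of `n_ops` back-to-back operations finishes.

    A new operation can start every `initiation_interval` cycles, and each
    one takes `latency` cycles from start to result.
    """
    if n_ops <= 0:
        return 0
    return (n_ops - 1) * initiation_interval + latency


# A hypothetical 3-cycle multiplier, fully pipelined versus not pipelined.
pipelined = completion_cycle(8, latency=3, initiation_interval=1)
unpipelined = completion_cycle(8, latency=3, initiation_interval=3)
```

Eight multiplications finish after 10 cycles when a new one can start every cycle, and after 24 cycles when the unit must drain before accepting the next, which is why the compiler needs both numbers to schedule correctly.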

Comparison: CISC, RISC, VLIW

Advantages of VLIW
Compiler prepares fixed packets of multiple operations that give the full "plan of execution"
– dependencies are determined by the compiler and used to schedule according to function unit latencies
– function units are assigned by the compiler and correspond to the position within the instruction packet ("slotting")
– compiler produces fully scheduled, hazard-free code => hardware doesn't have to "rediscover" dependencies or schedule
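
"Slotting" can be pictured as a fixed-width tuple in which the position alone selects the function unit. The slot layout `SLOTS` and the assembly strings below are hypothetical and only serve to show the idea.

```python
# Hypothetical machine: slot position i always drives the unit named SLOTS[i].
SLOTS = ("alu0", "alu1", "mem", "branch")
NOP = "nop"


def make_packet(**ops):
    """Build one fixed-width instruction packet.

    Keyword names must be slot names; unfilled slots become explicit nops,
    because the packet always has one entry per function unit.
    """
    unknown = set(ops) - set(SLOTS)
    if unknown:
        raise ValueError(f"no such slot(s): {sorted(unknown)}")
    return tuple(ops.get(slot, NOP) for slot in SLOTS)


def dispatch(packet):
    """What the hardware does: route each position to its unit, no checks."""
    return dict(zip(SLOTS, packet))


packet = make_packet(alu0="add r3,r1,r2", mem="ld r4,0(r5)")
```

The resulting packet is `("add r3,r1,r2", "nop", "ld r4,0(r5)", "nop")`. Note that `dispatch` does no dependency analysis at all; it relies on the compiler having placed only compatible operations together.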

Disadvantages of VLIW
Compatibility across implementations is a major problem
– VLIW code won't run properly with a different number of function units or different latencies
– unscheduled events (e.g., cache miss) stall the entire processor
Code density is another problem
– low slot utilization (mostly nops)
– reduce nops by compression ("flexible VLIW", "variable-length VLIW")

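The code-density problem and one possible remedy can be sketched in a few lines. The slide names compression schemes without describing them, so the bit-mask encoding below is an assumed, generic illustration and not the format of any particular "flexible" or "variable-length" VLIW.

```python
NOP = "nop"


def slot_utilisation(words):
    """Fraction of slots across all words that hold a real operation."""
    total = sum(len(word) for word in words)
    used = sum(1 for word in words for op in word if op != NOP)
    return used / total if total else 0.0


def compress(word):
    """Drop nops, keeping a bit mask that records which slots were occupied."""
    mask = 0
    ops = []
    for slot, op in enumerate(word):
        if op != NOP:
            mask |= 1 << slot
            ops.append(op)
    return mask, ops


def expand(mask, ops, width):
    """Rebuild the full-width word from the mask and the stored operations."""
    remaining = iter(ops)
    return [next(remaining) if (mask >> slot) & 1 else NOP
            for slot in range(width)]


code = [["add", NOP, NOP, "ld"], [NOP, NOP, NOP, "br"]]
```

This two-word program uses 3 of its 8 slots (utilisation 0.375). Compressing the first word stores the mask `0b1001` plus two operations instead of four slots, and `expand` restores the original word before dispatch.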

Thanks