Ph. D. Research Plan presentation by Anup Gangwar


Embedded Systems Group
(http://www.cse.iitd.ac.in/esproject)
Department of Computer Science & Engineering
Indian Institute of Technology Delhi
June 11, 2002
Ph.D. Research Plan Presentation
Anup Gangwar

Slide 2
Presentation Outline
Introduction and motivation
Specialization opportunities in VLIW processors
Methodology
Validation framework (supporting tools required)
Work plan
Status of work
References

Slide 3
Introduction
Why customize architectures?
General-purpose computing domain vs. embedded domain
Customization leads to cheaper design solutions
Architectural choices for exploiting ILP
Superscalar processors
Extract ILP at run time, which requires complex hardware
Limited clock speeds and high power dissipation
Not well suited to embedded applications
VLIW processors
Compiler has detailed knowledge of the hardware
Compiler extracts ILP statically, which simplifies the hardware
Possible to attain higher clock speeds

Slide 4
Introduction - Problems with VLIW Processors
A complex compiler is required for extracting ILP
Adequate hardware support is needed for compiler-controlled execution
Code size expands due to explicit NOPs if:
The application does not contain enough parallelism
The compiler is not able to extract parallelism from the application
Good instruction encoding and NOP compression schemes are needed

Slide 6
Specialization Opportunities -> FUs

Slide 7
Specialization Opportunities -> FUs (contd...)
Functional Unit Types
MISO (Multiple Input Single Output)
MIMO (Multiple Input Multiple Output)
MIMO with LD/ST (MIMO with memory interaction)
Rigid or flexible I/O timeshapes
NAME             Inputs and Sources          Outputs and Dests.          I/O Policy
MISO             Multiple (Regfile)          Single (Regfile)            Flexible or Rigid
MIMO             Multiple (Regfile)          Multiple (Regfile)          Flexible or Rigid
MIMO with LD/ST  Multiple (Regfile or Mem.)  Multiple (Regfile or Mem.)  Flexible or Rigid for Reg. and block LD/ST for mem.

Slide 8
Specialization Opportunities -> Reg. File
Single register file organization doesn’t scale well
Area grows as N^3
Delay grows as N^(3/2)
Power grows as N^3
where N is the number of functional units connected to the register file
Clustered VLIW architectures are the solution
Each FU can read from/write to only a subset of registers
Data copying may increase execution latency
Powerful application analysis is required to overcome the above-mentioned problems
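
The scaling argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the cubic area and power growth and the N^(3/2) delay growth quoted on this slide, and assuming the FUs are split evenly across clusters. The function names are hypothetical, and the model ignores the cost of the inter-cluster interconnect and of the extra copy operations.

```python
def rf_cost(num_fus):
    """Relative cost of one register file shared by num_fus functional units."""
    return {
        "area": num_fus ** 3,
        "delay": num_fus ** 1.5,
        "power": num_fus ** 3,
    }


def clustered_cost(num_fus, num_clusters):
    """Relative cost when the FUs are split evenly across num_clusters register files.

    Assumption: area and power add up over clusters, delay is that of one cluster.
    """
    per_cluster = rf_cost(num_fus // num_clusters)
    return {
        "area": num_clusters * per_cluster["area"],
        "delay": per_cluster["delay"],
        "power": num_clusters * per_cluster["power"],
    }


monolithic = rf_cost(8)           # one register file serving 8 FUs
clustered = clustered_cost(8, 2)  # two clusters of 4 FUs each

for metric in ("area", "delay", "power"):
    print(metric, monolithic[metric], clustered[metric])
```

Under these assumptions, two clusters of four FUs need a quarter of the register file area and power of a single eight-FU register file, which is the motivation for clustering. The price is the data copying mentioned above.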

Slide 9
Specialization Opportunities -> Reg. File (contd...)
A Clustered VLIW Architecture

Slide 10
Specialization Opportunities -> Interconnect
Clustering FUs together requires deciding the interconnection network (ICN)
between different clusters
between clusters and memory
Analysis of data access patterns is required for evaluating cost-performance tradeoffs
Current ASIP vendors do not offer customizable interconnects
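
One simple form of data access pattern analysis is to count how many values must cross cluster boundaries for a given assignment of operations to clusters. The sketch below is illustrative only: the data-flow graph, the operation names and the exhaustive search are all assumptions made for the example, not the method proposed in this presentation.

```python
from itertools import product


def inter_cluster_copies(edges, assignment):
    """Count data-flow edges whose producer and consumer are in different clusters."""
    return sum(1 for src, dst in edges if assignment[src] != assignment[dst])


def best_assignment(ops, edges, num_clusters, max_ops_per_cluster):
    """Exhaustively search for the assignment with the fewest inter-cluster copies."""
    best = None
    for combo in product(range(num_clusters), repeat=len(ops)):
        # Respect the (hypothetical) capacity of each cluster.
        if any(combo.count(c) > max_ops_per_cluster for c in range(num_clusters)):
            continue
        assignment = dict(zip(ops, combo))
        cost = inter_cluster_copies(edges, assignment)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


# Hypothetical data-flow graph: (a * b) + c, stored to memory.
ops = ["ld_a", "ld_b", "ld_c", "mul", "add", "st"]
edges = [("ld_a", "mul"), ("ld_b", "mul"), ("mul", "add"), ("ld_c", "add"), ("add", "st")]

round_robin = {op: i % 2 for i, op in enumerate(ops)}
naive_cost = inter_cluster_copies(edges, round_robin)
best_cost, best_map = best_assignment(ops, edges, num_clusters=2, max_ops_per_cluster=3)
print(naive_cost, best_cost, best_map)
```

Exhaustive search is only practical for toy graphs like this one. It is used here because it makes the cost being minimized easy to see.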

Slide 11
Specialization Opportunities -> Encoding
Instruction encoding/decoding scheme affects
Code size
Object code compatibility
Branch misprediction penalty
Hardware cost
Address specification in code size
Each UniOp is equivalent to a RISC/CISC instruction
[Figure: one MultiOp made up of four UniOps]

Slide 12
Specialization Opportunities -> Encoding (contd...)
NOPs in a MultiOp
Slot:  IALU.0  IALU.1  FALU.0  BU.0
Op:    ADD     NOP     FMUL    NOP
[Figure: VLIW Processor Pipeline with Instruction Decompressor]
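
To make the role of the instruction decompressor concrete, here is a minimal sketch of one common NOP compression idea: store a slot mask followed by only the non-NOP UniOps. It uses the four-slot MultiOp shown above (ADD, NOP, FMUL, NOP). The 32-bit UniOp width and the function names are assumptions made for illustration. This is not the encoding scheme of any particular processor.

```python
SLOTS = ["IALU.0", "IALU.1", "FALU.0", "BU.0"]
OP_BITS = 32  # assumed width of one UniOp


def compress(multiop):
    """Return (mask, ops); bit i of mask is set when slot i holds a real operation."""
    mask = 0
    ops = []
    for i, op in enumerate(multiop):
        if op != "NOP":
            mask |= 1 << i
            ops.append(op)
    return mask, ops


def decompress(mask, ops, width=len(SLOTS)):
    """Rebuild the full MultiOp by re-inserting NOPs where the mask bit is clear."""
    remaining = iter(ops)
    return [next(remaining) if (mask >> i) & 1 else "NOP" for i in range(width)]


multiop = ["ADD", "NOP", "FMUL", "NOP"]
mask, ops = compress(multiop)

uncompressed_bits = len(multiop) * OP_BITS         # every slot stored, NOPs included
compressed_bits = len(SLOTS) + len(ops) * OP_BITS  # slot mask plus the real ops only
print(bin(mask), ops, uncompressed_bits, compressed_bits)
```

The decompressor sits in front of the decode stage and performs the `decompress` step, so the rest of the pipeline still sees a full-width MultiOp.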

Slide 13
Specialization Opportunities -> Summary

Slide 15
Existing Methodologies -> Simulation Driven

Slide 16
VLIW ASIP Synthesis Methodology
[Figure: flow diagram with the following blocks]
Task Set and Constraints
Architecture Description
Architecture Design Space Exploration
Application Parameter Extraction
Retargetable Compiler
Instruction Encoding Specialization
Validation (Simulation with encoded instructions)
Architecture Description (Output to synthesizer)

Slide 18
Validation Framework -> Trimaran
[Figure: Trimaran flow with the following blocks]
C Program
IMPACT
Simulator Generator
ELCOR
Bridge Code
ELCOR IR
HMDES Machine Description
Generated Simulator (Statistics)
•ANSI C parsing
•Code profiling
•Classical machine-independent optimizations
•Block formation
•Machine-dependent code optimizations
•Code scheduling
•Register allocation
•ELCOR IR to low-level C files
•HPL-PD virtual machine
•Cache simulation
•Performance statistics
•Compute and stall cycles
•Cache stats
•Spill code info

Slide 19
Validation Framework -> Trimaran (contd...)
[Figure: flow diagram with the following blocks]
Code Processor
Native Compiler
REBEL
HMDES
Low level C files
C libraries
Emulation Library
Executable for the host platform

Slide 20
Validation Framework -> Retargetable Assembler
[Figure: flow diagram with the following blocks]
Instruction Encoding Description
Toolkit Generator
Generated Assembler
Assembly Instructions
Object Code
To Simulator (for simulation with encoded instructions)

Slide 22
Work Plan -> Interconnect/RF/FU Specialization
Initially model the interconnect problem as an integer linear program (ILP) and later move to other solutions
The code selection problem in compilers is similar to identifying compute-intensive parts for AFUs
The number and types of FUs have not been properly explored
The RF clustering problem has not been dealt with elsewhere
Jacome et al.
Deal with Interconnect/RF/FU specialization simultaneously
Operation chaining is not considered

Slide 23
Work Plan -> Encoding/Decoding Specialization
Goal is to be able to generate encoding schemes automatically
Work of Shail Aditya et al.
Basically a parameterized encoding scheme
Techniques are specific to the HPL-PD architecture
Does not address dynamic code size minimization
Encoding template is fixed; exploration is limited to the template design space
Various encoding templates need to be explored; the template itself may also be derived from the application
Dynamic code size minimization needs to be considered
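
The difference between static and dynamic code size can be shown with a small calculation. This sketch takes dynamic code size to mean the number of instruction bits fetched at run time, which is an assumption about how the term is used here. The basic blocks, execution counts and the two encodings (fixed-width, and a 4-bit slot mask plus the non-NOP ops) are all hypothetical.

```python
SLOTS = 4
OP_BITS = 32  # assumed width of one UniOp


def fixed_bits(n_ops):
    """Uncompressed encoding: every slot is stored, NOPs included."""
    return SLOTS * OP_BITS


def masked_bits(n_ops):
    """Mask-based encoding: a slot mask plus only the non-NOP ops."""
    return SLOTS + n_ops * OP_BITS


def static_size(blocks, encoded_bits):
    """Bits occupied by the program image."""
    return sum(encoded_bits(n) for b in blocks for n in b["multiops"])


def dynamic_size(blocks, encoded_bits):
    """Bits fetched at run time, weighting each block by its execution count."""
    return sum(b["exec_count"] * encoded_bits(n) for b in blocks for n in b["multiops"])


# Hypothetical profile: each entry of "multiops" is the number of non-NOP ops in one MultiOp.
blocks = [
    {"name": "init", "exec_count": 1, "multiops": [1, 1, 2, 1]},
    {"name": "loop", "exec_count": 1000, "multiops": [4, 3, 4]},
]

static_fixed = static_size(blocks, fixed_bits)
static_masked = static_size(blocks, masked_bits)
dynamic_fixed = dynamic_size(blocks, fixed_bits)
dynamic_masked = dynamic_size(blocks, masked_bits)
print(static_fixed, static_masked, dynamic_fixed, dynamic_masked)
```

In this made-up profile the masked encoding shrinks the program image by about 40%, but the bits fetched at run time drop by only about 5%, because the hot loop is already densely packed. An encoding chosen for static size alone can therefore miss most of the run-time cost.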

Slide 25
Work Status -> Specialized FUs in Trimaran
Modeling MISOs
Model as external function calls
Locate the call in Trimaran bridge code and replace it with an AFU op
Model the new AFU in MDES with the required ops
Introduce the semantics in the simulator op definitions file
Modeling MIMOs
Model as external function calls returning void
Locate the call in Trimaran bridge code and replace it with an AFU op
Explicitly reserve registers in C code for returning values
Introduce the operation semantics in the simulator op definitions file

Slide 26
Work Status -> Specialized FUs in Trimaran (contd...)
Modeling MIMOs with LD/ST
Model as regular MIMOs
Memory interaction with block LD/ST at the beginning and end of execute cycles
Additionally
Possible to impose register file constraints
Various I/O timeshapes, rigid or flexible
Possible to introduce pipelined functional units

Slide 27
Work Status -> Instruction Enc. in Trimaran

Slide 28
Work Status -> Instruction Enc. in Trimaran (contd...)
New Jersey Machine Code Toolkit (NJMC)
Deals with bits at symbolic level
Can be used to write assemblers/disassemblers
Specification in SLED (Specification Language for Encoding/Decoding)
Model instruction decompressor in HMDES
Instrument ELCOR to generate assembly code
Encoding is done using procedures generated by NJMC
Problems with NJMC
VLIW instructions need to be broken up into 32-bit tokens
Encoded instructions must end on an 8-bit boundary
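
The two constraints listed above can be illustrated without the toolkit itself. The sketch below does not use any NJMC or SLED API. It only shows, on a bit string, what padding an encoded instruction to an 8-bit boundary and splitting it into tokens of at most 32 bits look like. The 68-bit example instruction and the function names are hypothetical.

```python
def pad_to_byte(bits):
    """Pad a bit string with zeros so that its length is a multiple of 8."""
    return bits + "0" * (-len(bits) % 8)


def split_tokens(bits, token_bits=32):
    """Split a byte-aligned bit string into tokens of at most token_bits bits."""
    padded = pad_to_byte(bits)
    return [padded[i:i + token_bits] for i in range(0, len(padded), token_bits)]


# Hypothetical 68-bit encoded MultiOp: a 4-bit slot mask followed by two 32-bit UniOps.
encoded = "0101" + "1" * 32 + "0" * 32
tokens = split_tokens(encoded)
print([len(t) for t in tokens])
```

Here four padding bits are added to reach 72 bits, and the instruction is then emitted as two 32-bit tokens and one 8-bit token. Padding of this kind is part of the code size cost that an encoding scheme has to account for.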

Slide 29
Work Status -> Code Gen. for Clustered ASIPs
ELCOR
Disadvantages
ELCOR is heavily oriented towards HPL-PD architecture
Does not support clustered VLIW architecture
Advantages
Strong optimizing compiler
Rich library to deal with the IR
IMPACT compiler system offers another choice for building a backend
Feasibility study is being carried out to fix a particular direction of work

Slide 31
References
Bhuvan Middha, Varun Raj, Anup Gangwar, M. Balakrishnan, Anshul Kumar and Paolo Ienne, “A Trimaran based framework for exploring design space of VLIW ASIPs with coarse grain FUs”, ISSS-2002.
Anup Gangwar, M. Balakrishnan and Anshul Kumar, “A framework for studying the effect of VLIW processor instruction encoding and decoding schemes”, Mini Project, Dept. of CSE.
M. Jacome and G. de Veciana, “Design challenges for new application specific processors”, IEEE Design and Test of Computers-2000.
B. Ramakrishna Rau and Michael S. Schlansker, “Embedded computer architecture and automation”, IEEE Computer-2001.
Michael S. Schlansker and B. Ramakrishna Rau, “EPIC: An architecture for instruction-level parallel processors”, HPCA-2000.
N. G. Busa, A. van der Werf and M. Bekooij, “Scheduling coarse grain operations for VLIW processors”, ASPDAC-1998.
Shail Aditya, Scott A. Mahlke and B. Ramakrishna Rau, “Code size minimization and retargetable assembly for custom EPIC and VLIW processors”, ISSS-1999.