Introduction to Digital Signal Processing Systems

Uploaded by jasontseng19, Sep 08, 2024 (68 slides)


Introduction to Digital Signal Processing Systems
Shao-Yi Chien

DSP in VLSI Design Shao-Yi Chien 2
Outline
Introduction
Typical DSP algorithms
Scaled CMOS technologies
Representations of DSP algorithms

DSP in VLSI Design Shao-Yi Chien 3
Analog Signal
Real-world signal
Infinite accuracy in time and magnitude
[Figure: f plotted against t]
Example: f(5.17382…)=3.7416…

DSP in VLSI Design Shao-Yi Chien 4
Digital Signal
Obtained after sampling and quantization
Finite accuracy in time and magnitude
Easy to process with digital processing elements
[Figure: f plotted against t after sampling, then after quantization]
Example: f(5)=4
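The two steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the signal `analog`, the sampling period `ts`, and the quantization step `step` are made-up names and values, not taken from the slides.

```python
import math

def analog(t):
    """Hypothetical continuous-time signal, used only for illustration."""
    return 4.0 * math.sin(2.0 * math.pi * t / 20.0)

def sample(signal, ts, count):
    """Sampling: evaluate the signal only at the instants n * ts."""
    return [signal(n * ts) for n in range(count)]

def quantize(value, step):
    """Uniform quantization: snap the magnitude to a nearby multiple of step."""
    return step * round(value / step)

samples = sample(analog, ts=1.0, count=6)           # finite accuracy in time
digital = [quantize(v, step=1.0) for v in samples]  # finite accuracy in magnitude
```

With these assumed values the sample taken at t = 5 quantizes to 4, in the spirit of the f(5)=4 example.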

DSP in VLSI Design Shao-Yi Chien 5
Typical DSP Systems
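[Block diagram: SENSORS supply digital inputs and analog inputs; analog inputs pass through A/D (sampling, quantization) into DIGITAL SIGNAL PROCESSING; results reach ACTUATORS as digital outputs, or as analog outputs through D/A (inverse quantization, reconstruction); the DSP block also connects to the user interface and display]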

DSP in VLSI Design Shao-Yi Chien 6
Advantages of Analog Signal Processing
Can operate at very high frequencies
Sometimes smaller area
Low power

DSP in VLSI Design Shao-Yi Chien 7
Advantages of Digital Signal Processing (DSP)
More robust
Insensitive to environment and component tolerance
The accuracy can be controlled better
Can cancel the noise and interference while amplifying the signal
Predictable, repeatable behavior
Can be stored and recovered, transmitted and received, processed and manipulated without error

DSP in VLSI Design Shao-Yi Chien 8
Features of DSP Systems
Real-time throughput requirement
So-called hard real-time systems
Data-driven property
Non-terminating program

DSP in VLSI Design Shao-Yi Chien 9
Hard Real-Time Systems
Courtesy Warner Brothers Studios

DSP in VLSI Design Shao-Yi Chien 11
Performance Metrics of DSP
Systems
Hardware circuitry and resources (area)
Speed of execution
Power consumption
Finite word length performance

DSP in VLSI Design Shao-Yi Chien 12
Characteristics of DSP Systems
(1/4)
Data format
1D: speech (over time)
2D: image
3D: video (images over time)

DSP in VLSI Design Shao-Yi Chien 13
Characteristics of DSP Systems
(2/4)
Algorithms

DSP in VLSI Design Shao-Yi Chien 14
Characteristics of DSP Systems
(3/4)
Sample rates

DSP in VLSI Design Shao-Yi Chien 15
Characteristics of DSP Systems
(4/4)
Clock rates
Numeric representations

DSP in VLSI Design Shao-Yi Chien 16
Standard Digital Signal
Processors (1/2)
Allow rapid prototyping and time-to-market
Sometimes the execution speed and code size are reasonably good
Not always cost-effective
Often cannot meet the requirements for throughput, power consumption, and size

DSP in VLSI Design Shao-Yi Chien 17
Standard Digital Signal
Processors (2/2)
DSP Architectures
Harvard architecture
MAC (multiply-accumulate) unit
Fixed-point arithmetic
Alternatives:
DSP-enhanced CPU
GPU
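How a MAC unit and fixed-point arithmetic work together can be sketched as follows. This is a minimal illustration, assuming the common Q15 format (16-bit words with 15 fractional bits) and a wide accumulator; the function names are hypothetical and do not describe any particular processor.

```python
Q = 15                                    # fractional bits (Q15 is an assumption)
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1

def to_q15(x):
    """Convert a real number in [-1, 1) to a saturated Q15 integer."""
    return max(INT16_MIN, min(INT16_MAX, round(x * (1 << Q))))

def mac(acc, a, b):
    """One multiply-accumulate step: acc + a*b, kept at full (Q30) precision."""
    return acc + a * b

def acc_to_q15(acc):
    """Round the Q30 accumulator back to Q15 and saturate to 16 bits."""
    rounded = (acc + (1 << (Q - 1))) >> Q
    return max(INT16_MIN, min(INT16_MAX, rounded))

def dot_q15(xs, hs):
    """Inner product of two Q15 vectors using repeated MAC steps."""
    acc = 0
    for x, h in zip(xs, hs):
        acc = mac(acc, x, h)
    return acc_to_q15(acc)

xs = [to_q15(v) for v in (0.5, -0.25, 0.125)]
hs = [to_q15(v) for v in (0.5, 0.5, 0.5)]
result = dot_q15(xs, hs)                  # Q15 integer representing 0.1875
```

Keeping the accumulator wider than the operands and rounding only once at the end is a design choice of this sketch that limits the accumulated rounding error.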

DSP in VLSI Design Shao-Yi Chien 18
Application-Specific ICs for DSP
(1/2)
Better performance
Processing capacity
Power consumption
Pin-restriction problem
Main problem: the system is very complex to design
Long time-to-market

DSP in VLSI Design Shao-Yi Chien 19
Application-Specific ICs for DSP
(2/2)
Large design space
Hard to find an optimal solution
Design flow: system specification → algorithm → hardware architecture → logic implementation → VLSI implementation
ASIC / accelerator design in an SoC

DSP in VLSI Design Shao-Yi Chien 20
Typical DSP Algorithms
Convolution
Correlation
Digital filters
Adaptive filters
Motion estimation
Discrete cosine transform (DCT)
Vector quantization (VQ)
Viterbi algorithm and dynamic programming
Decimator and expander
Wavelets and filter banks

DSP in VLSI Design Shao-Yi Chien 21
Convolution (1/2)
Can be used to describe the behavior of a linear time-invariant (LTI) system
x(n): input signal
y(n): output signal
h(n): unit-sample response
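For an LTI system the output is the convolution sum y(n) = Σ_k h(k) x(n−k). A minimal Python sketch follows, assuming finite-length sequences that are zero outside their stored range; the function name `convolve` is chosen here for illustration.

```python
def convolve(x, h):
    """Return y(n) = sum over k of h(k) * x(n - k) for finite-length x and h."""
    y = [0] * (len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):      # x is taken as zero outside its range
                y[n] += h[k] * x[n - k]
    return y

x = [1, 2, 3]     # input signal
h = [1, 1]        # unit-sample response
y = convolve(x, h)
```

Because `h` has finite length here, this corresponds to the FIR case; an IIR system would need a recursive formulation instead.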

DSP in VLSI Design Shao-Yi Chien 22
Convolution (2/2)
Finite impulse response (FIR) system
Infinite impulse response (IIR) system

DSP in VLSI Design Shao-Yi Chien 23
Digital Filters
LTI, causal filter
M-tap finite impulse response filter

DSP in VLSI Design Shao-Yi Chien 25
Moore’s Law

DSP in VLSI Design Shao-Yi Chien 26
Scaled CMOS technology
(Moore’s Law) (1/3)

DSP in VLSI Design Shao-Yi Chien 27
Scaled CMOS technology
(Moore’s Law) (2/3)
Year of first DRAM shipment | 1995 | 1998 | 2001 | 2004 | 2007 | 2010
Minimum feature size (um) | 0.35 | 0.25 | 0.18 | 0.13 | 0.10 | 0.07
Memory in bits/chip (DRAM/FLASH) | 64M | 256M | 1G | 4G | 16G | 64G
Microprocessor transistors per chip (2.3 times per generation) | 12M | 28M | 64M | 150M | 350M | 800M
ASIC (gates per chip) | 5M | 14M | 26M | 50M | 210M | 430M
Chip frequency (MHz) for a high-performance on-chip clock | 300 | 450 | 600 | 800 | 1,000 | 1,100
Maximum number of wiring levels (logic), on chip | 4-5 | 5 | 5-6 | 6 | 6-7 | 7-8
Power supply voltage (V) for desktop | 3.3 | 2.5 | 1.8 | 1.5 | 1.2 | 0.9
Maximum power for high performance with heat sink (W) | 80 | 100 | 120 | 140 | 160 | 180
Source: SIA (Semiconductor Industry Association) road map
ITRS: International Technology Roadmap for Semiconductors
http://www.itrs.net/

DSP in VLSI Design Shao-Yi Chien 28
Scaled CMOS technology
(Moore’s Law) (3/3)

DSP in VLSI Design Shao-Yi Chien 29
DSP and VLSI
Modern DSP
Well suited to VLSI implementation
Feasible or economically viable only if implemented using VLSI technologies
VLSI
Large investment → needs a large volume of products
Communication
Consumer applications
Necessary performance requirement (especially real-time requirement)
DSP systems are hard real-time systems

DSP in VLSI Design Shao-Yi Chien 30
Example: the Chip for PS2
[Figure: successive die shrinks of the Emotion Engine and Graphics Synthesizer chips. Labels as extracted: ’98, 239 mm² @ 0.25 um; ’98, 279 mm² @ 0.25 um; ’99, 110 mm² @ 0.18 um; ’01, 75 mm² @ 0.14 um; ’99, 108 mm² @ 0.18 um; ’02, 77 mm² @ 0.16 um; ’04, 87 mm² @ 90 nm]
Source: Ken Kutaragi, ISSCC2006 Plenary Talk, Feb 6, 2006, SF, CA (Slide No. 20)
Ref: K. Kutaragi, The Future of Computing for "Real-Time Entertainment," ISSCC2006

DSP in VLSI Design Shao-Yi Chien 31
Problems: Interconnection

DSP in VLSI Design Shao-Yi Chien 32
Problems: Increasing Static Power
V_DD decreases
Save dynamic power
Protect thin gate oxides and short channels
No point in a high value because of velocity saturation
V_t must decrease to maintain device performance
But this causes an exponential increase in OFF leakage
Major future challenge
[Figure: static and dynamic power trends] [Moore03]
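The trade-off can be made concrete with two first-order textbook models: switching power proportional to C·V_DD²·f, and subthreshold leakage that grows by a factor of ten for every subthreshold swing of threshold-voltage reduction. The sketch below rests on assumptions: the activity factor, capacitance, and the 100 mV/decade swing are illustrative values, not figures from the slides.

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """First-order switching power: P = activity * C * Vdd^2 * f."""
    return activity * capacitance * vdd ** 2 * frequency

def leakage_ratio(delta_vt_mv, swing_mv_per_decade=100.0):
    """Factor by which OFF leakage grows when Vt is lowered by delta_vt_mv.

    The default swing of 100 mV/decade is an assumed, illustrative value.
    """
    return 10.0 ** (delta_vt_mv / swing_mv_per_decade)

# Lowering the supply from 3.3 V to 1.8 V at the same frequency and load.
p_old = dynamic_power(0.1, 1e-9, 3.3, 300e6)
p_new = dynamic_power(0.1, 1e-9, 1.8, 300e6)
saving = p_new / p_old            # quadratic benefit of the lower supply

# Lowering Vt by 200 mV to recover speed.
growth = leakage_ratio(200.0)     # exponential cost in static power
```

Under these assumptions the dynamic power falls to roughly 30% of its former value, while the leakage current grows by about two orders of magnitude.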

DSP in VLSI Design Shao-Yi Chien 34
Problems: Power Density
Intel VP Patrick Gelsinger (ISSCC 2001)
If scaling continues at the present pace, by 2005 high-speed processors would have the power density of a nuclear reactor; by 2010, a rocket nozzle; and by 2015, the surface of the sun.
“Business as usual will not work in the future.”
Intel stock dropped 8% on the next day
But attention to power is increasing
[Gelsinger01]

DSP in VLSI Design Shao-Yi Chien 35
Problems: Scaling and the Power Crisis
[Figure: module heat flux (W/cm²) versus year of announcement, 1950 to 2020, for bipolar and CMOS technologies, with annotations "Ultra-low Vdd" and "3-D integration ?"]
After: R. Schmidt et al., IBM J. R&D, (2002).
Delay 3-4X
Power 15X
Density 50X

DSP in VLSI Design Shao-Yi Chien 36
FinFET

DSP in VLSI Design Shao-Yi Chien 37
FinFET

DSP in VLSI Design Shao-Yi Chien 38
FinFET

DSP in VLSI Design Shao-Yi Chien 39
Science and Technology Strategy / Roadmap
[Figure: timeline from 2000 to 2030 showing research (R) and development (D) phases for each plan]
Plan A: Extending Si CMOS
Plan B: Subsystem Integration
Plan C: Post Si CMOS Options
Plan Q: Quantum Computing
Ref: Tze-Chiang Chen, "Where Si-CMOS is Going: Trendy Hype vs. Real Technology," ISSCC2006.

DSP in VLSI Design Shao-Yi Chien 40
More Moore & More than Moore !!!

DSP in VLSI Design Shao-Yi Chien 41
2.5D Interposer

DSP in VLSI Design Shao-Yi Chien 42
3D-IC Technology

DSP in VLSI Design Shao-Yi Chien 43
Heterogeneous System Integration
[TSMC 2015]

DSP in VLSI Design Shao-Yi Chien 44
InFO (Integrated Fan Out)
[TSMC 2015]

DSP in VLSI Design Shao-Yi Chien 45
CoWoS (Chip-On-Wafer-On-Substrate)
[TSMC 2015]

DSP in VLSI Design Shao-Yi Chien 46
Example of CoWoS
[TSMC 2015]

DSP in VLSI Design Shao-Yi Chien 47
DSP Architecture Design?
Given DSP algorithms, find the “best”
solution in the design space under certain
constraints
Or modify or develop the algorithm to be “hardware oriented” or “hardware friendly,” and then develop the hardware architecture

DSP in VLSI Design Shao-Yi Chien 48
Abstraction Layers
System (ex: MP3 player)
Algorithm (ex: FIR filter)
Hardware architecture (ex: array architecture,…)
Arithmetic units (ex: multiplier, adder, …)
Logic gates (ex: AND, OR, …)
Transistors (ex: NMOS, PMOS)
Layout

DSP in VLSI Design Shao-Yi Chien 49
The Higher the Abstraction,
The Larger the Design Space
System Level
Algorithm Level
Hardware Architecture Level
Arithmetic Level
Gates Level
Transistors Level
Physical Level
Design Space

DSP in VLSI Design Shao-Yi Chien 50
The Higher the Abstraction,
The More Important
System Level
Algorithm Level
Hardware Architecture Level
Arithmetic Level
Gates Level
Transistors Level
Physical Level
Performance Space

DSP in VLSI Design Shao-Yi Chien 51
Representations of DSP
Algorithms
DSP algorithms: nonterminating program
Iteration period
Sampling rate
Latency
Throughput
Clock frequency
Critical path

DSP in VLSI Design Shao-Yi Chien 52
Graphical Representations of
DSP Algorithms
Can bridge the gap between algorithmic
descriptions and structural implementations
Block diagram
Signal-flow graph (SFG)
Data-flow graph (DFG)
Dependence graph (DG)

DSP in VLSI Design Shao-Yi Chien 53
Block Diagram (1/5)
The most frequently used representation
Can be constructed with different levels of
abstraction
Can be directly mapped to a circuit implementation

DSP in VLSI Design Shao-Yi Chien 54
Block Diagram (2/5)

DSP in VLSI Design Shao-Yi Chien 55
Block Diagram (3/5)
Data broadcast FIR filter

DSP in VLSI Design Shao-Yi Chien 56
Block Diagram (4/5)
Delays annotated in the figure: 4 ns and 1 ns
Critical path: 4+1+1=6 ns
Max clock frequency = 1/(6 ns) ≈ 167 MHz

DSP in VLSI Design Shao-Yi Chien 57
Block Diagram (5/5)
Critical path: 4+1=5 ns
Max clock frequency = 1/(5 ns) = 200 MHz
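The two clock-rate figures can be reproduced with a short calculation. The sketch assumes that the 4 ns delay belongs to a multiplier and the 1 ns delay to an adder, and that the filter has three taps; the figures themselves did not survive extraction, so these roles are an interpretation.

```python
T_MULT_NS = 4.0    # assumed multiplier delay
T_ADD_NS = 1.0     # assumed adder delay
TAPS = 3           # assumed number of filter taps

def max_clock_mhz(critical_path_ns):
    """Maximum clock frequency in MHz for a critical path given in ns."""
    return 1e3 / critical_path_ns

# Direct form: one multiplier followed by a chain of (TAPS - 1) adders.
direct_form_ns = T_MULT_NS + (TAPS - 1) * T_ADD_NS

# Data-broadcast form: one multiplier and one adder between registers.
broadcast_ns = T_MULT_NS + T_ADD_NS

direct_mhz = max_clock_mhz(direct_form_ns)
broadcast_mhz = max_clock_mhz(broadcast_ns)
```

Under these assumptions the direct-form critical path grows with the number of taps, whereas the data-broadcast path does not.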

DSP in VLSI Design Shao-Yi Chien 58
Signal Flow Graph (SFG) (1/4)
Nodes k
Computation or task
Directed edges (j, k)
Linear transformation
Source node
Sink node

DSP in VLSI Design Shao-Yi Chien 59
Signal Flow Graph (SFG) (2/4)

DSP in VLSI Design Shao-Yi Chien 60
Signal Flow Graph (SFG) (3/4)
Transpose property
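Transposing a single-input, single-output SFG (reversing every edge and exchanging the input and output nodes) leaves its input-output behavior unchanged. A minimal sketch, assuming an FIR filter as the example: the direct form and its transposed, data-broadcast form are simulated side by side. Function names and test values are illustrative.

```python
def fir_direct(x, h):
    """Direct form: delay line on the input, adder chain at the output."""
    delay = [0] * len(h)                 # delay[k] holds x(n - k)
    out = []
    for sample in x:
        delay = [sample] + delay[:-1]
        out.append(sum(c * d for c, d in zip(h, delay)))
    return out

def fir_transposed(x, h):
    """Transposed form: input broadcast to all multipliers, delays on partial sums."""
    taps = len(h)
    state = [0] * (taps - 1)             # registered partial sums
    out = []
    for sample in x:
        out.append(h[0] * sample + (state[0] if state else 0))
        # Each register takes its own product plus the next register's old value.
        state = [h[k] * sample + (state[k] if k < len(state) else 0)
                 for k in range(1, taps)]
    return out

h = [2, -1, 3]
x = [1, 0, 4, 5]
y_direct = fir_direct(x, h)
y_transposed = fir_transposed(x, h)
```

Both structures produce the same output sequence; they differ only in where the delays sit, which is what changes the critical path.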

DSP in VLSI Design Shao-Yi Chien 61
Signal Flow Graph (SFG) (4/4)
Used in digital filter structure and analysis
of finite word-length effects
Only applicable to linear networks
Cannot be used to describe multi-rate
DSP systems

DSP in VLSI Design Shao-Yi Chien 62
Data-Flow Graph (DFG) (1/4)
Nodes
Computations
Directed edges
Data paths (communication)
Has a nonnegative number of delays

DSP in VLSI Design Shao-Yi Chien 63
Data-Flow Graph (DFG) (2/4)
y(n)=x(n)+ay(n-1)
A: +
B: ×
Execution time
Synchronous DFG
Rate
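The recursion y(n)=x(n)+ay(n-1) can be simulated by stepping through the two nodes once per iteration. This is a minimal sketch that assumes a single delay element holding the previous output and a zero initial state; the function name and the coefficient value are illustrative.

```python
def run_dfg(x, a, y_init=0.0):
    """Iterate y(n) = x(n) + a * y(n-1) over the input sequence x."""
    delay = y_init                # content of the single delay element, y(n-1)
    out = []
    for sample in x:
        b_out = a * delay         # node B: multiplication
        a_out = sample + b_out    # node A: addition
        out.append(a_out)
        delay = a_out             # the output becomes y(n-1) of the next iteration
    return out

y = run_dfg([1.0, 0.0, 0.0, 0.0], a=0.5)   # unit-sample response
```

Within one iteration node B must finish before node A can fire, while the delay carries the dependence from one iteration to the next.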

DSP in VLSI Design Shao-Yi Chien 64
Data-Flow Graph (DFG) (3/4)
Data-driven property of DSP
Any node can fire whenever all the input data
are available
Intra-iteration precedence constraint
Inter-iteration precedence constraint
Can be used to describe both linear
single-rate and nonlinear multi-rate DSP
systems

DSP in VLSI Design Shao-Yi Chien 65
Data-Flow Graph (DFG) (4/4)
Use single-rate DFG (SRDFG) to represent multi-rate DFG (MRDFG)
3f_A = 5f_B
2f_B = 3f_C
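Reading the two relations as 3·f_A = 5·f_B and 2·f_B = 3·f_C, where f is the number of times a node fires per iteration, the smallest integer solution gives the number of copies of each node in the equivalent single-rate graph. The sketch below assumes a connected, consistent graph and interprets each edge as (source, destination, samples produced, samples consumed); that encoding and the function name are illustrative.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def repetition_vector(nodes, edges):
    """Smallest positive integers f with produced * f[src] == consumed * f[dst].

    Assumes the graph is connected and the rate equations are consistent.
    """
    rate = {nodes[0]: Fraction(1)}
    changed = True
    while changed:                        # propagate rates along the edges
        changed = False
        for src, dst, produced, consumed in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produced / consumed
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consumed / produced
                changed = True
    # Scale by the least common multiple of the denominators to get integers.
    scale = reduce(lambda p, q: p * q // gcd(p, q),
                   (r.denominator for r in rate.values()), 1)
    return {n: int(rate[n] * scale) for n in nodes}

firings = repetition_vector(["A", "B", "C"],
                            [("A", "B", 3, 5), ("B", "C", 2, 3)])
```

For this example the result is 5 firings of A, 3 of B, and 2 of C per iteration.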

DSP in VLSI Design Shao-Yi Chien 66
Dependence Graph (1/2)
A directed graph that shows the
dependence of the computation
Node: computation
No node in a DG is ever reused on a
single computation basis
Single-assignment representation
Used for systolic-array design
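As a rough illustration of the single-assignment idea, the sketch below enumerates a simplified dependence graph for an FIR computation y(n) = Σ_k h(k) x(n−k). It assumes one node per (n, k) multiply-add and records only the accumulation dependence along k; the propagation of x and h through the graph is left out, and all names are hypothetical.

```python
def fir_dependence_graph(num_outputs, num_taps):
    """Build nodes and edges of a simplified DG for an FIR computation."""
    # One node per (n, k): the multiply-add h(k) * x(n - k) contributing to y(n).
    nodes = [(n, k) for n in range(num_outputs) for k in range(num_taps)]
    # The partial sum of y(n) accumulates along k: (n, k) depends on (n, k - 1).
    edges = [((n, k - 1), (n, k)) for n, k in nodes if k > 0]
    return nodes, edges

nodes, edges = fir_dependence_graph(num_outputs=4, num_taps=3)
node_count = len(nodes)      # every node is computed exactly once
edge_count = len(edges)
```

The node count grows with the number of outputs because no node is reused, which is the key contrast with a DFG.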

DSP in VLSI Design Shao-Yi Chien 67
Dependence Graph (2/2)

DSP in VLSI Design Shao-Yi Chien 68
DFG vs. DG
DFG
Nodes cover only the computation in one iteration and are reused iteratively
Contains delay elements
DG
Contains the computation for all iterations, and each node is used only once
Contains no delay elements