Pipeline Hazards
•There are situations, called hazards, that prevent the
next instruction in the instruction stream from
executing during its designated cycle
•There are three classes of hazards
•Structural hazard
•Data hazard
•Branch hazard
Structural Hazards. They arise from resource conflicts when the hardware cannot support all
possible combinations of instructions in simultaneous overlapped execution.
Data Hazards. They arise when an instruction depends on the result of a previous instruction
in a way that is exposed by the overlapping of instructions in the pipeline.
Control Hazards.They arise from the pipelining of branches and other instructions
that change the PC.
What Makes Pipelining Hard?
•Power failing,
•Arithmetic overflow,
•I/O device request,
•OS call,
•Page fault
3
Pipeline Hazards
•Structural hazard
•Resource conflicts when the hardware cannot support all possible
combination of instructions simultaneously
•Data hazard
•An instruction depends on the results of a previous instruction
•Branch hazard
•Instructions that change the PC
Structural Hazard
•This is the situation when two instructions require the use of a given
hardware resource at the same time.
•Arises from resource conflicts when the hardware can’t support all
possible combinations of overlapping instructions.
•Root Cause: resource conflicts
e.g., a processor with 1 reg write portbut having two instructions to
write in register
• Solution
stall one of the instructions until required unit is available
5
Structural Hazard
•Example
same resource i.e
mem conflict
data access
vs
instr fetch
Load
Instr i+3
Instr i+2
Instr i+1
MEM
IF
Structural hazard
•To solve this hazard, we “stall” the pipeline
until the resource is freed
•A stall is commonly called pipeline bubble,
since it floats through the pipeline taking space
but carry no useful work
How isit resolved?
Time
RegMem Reg
Load
Instruction 1
Instruction 2
Stall
Instruction 3
Bubbl
e
Bubbl
e
Bubbl
e
Bubbl
e
Bubbl
e
Pipeline generally stalled
byinserting a “bubble” or
NOP
RegMem MEM Reg
RegMem Reg
RegMem Reg
MEM
MEM
MEM
Structural Hazard Solution
Stall Instr i+3
till CC 5
10
Structural Hazards
Stalling instruction i+3 to start from clock cycle 5
Data Hazards
•A data hazard occur if either the source or the destination
operands of an instruction are not available at the time
expected in the pipeline. As a result some operations has to be
delayed and the pipeline stalls.
•A data hazard is a situation in which the pipeline is stalled
because the data to be operated on are delayed for some
reason.
11
Data hazard
Example:
ADD R1,R2,R3 R1R2+R3
SUBR4,R1,R5 R4R1-R5
AND R6,R1,R7 R6R1 AND R7
OR R8,R1,R9 R8R1 OR R9
XOR R10,R1,R11 R10R1 XOR R11
ADD writes the register in WB but SUB needs it in ID.
This is a data hazard
All instructions
after ADD use result
of ADD
Data hazard
•Delay load approach inserts a no-operation instruction to avoid the
data conflict
ADD R1R2+R3
No-op
No-op
SUB R4R1-R5
AND R6R1 AND R7
OR R8R1 OR R9
XOR R10R1 XOR R11
Data hazard
Data hazard
•It can be further solved by a simple hardware technique called forwarding (also
called bypassing or short-circuiting)
•The insight in forwarding is that the result is not really needed by SUB until the
ADD execute completely
•If the forwarding hardware detects that the previous ALU operation has written the
register corresponding to a source for the current ALU operation, control logic
selects the results in ALU instead of from memory
Forwarding
•Problem illustrated on previous slide can actually be solved
relatively easily – with forwarding
•Can we move the result from EX/MEM register to the beginning of
ALU (where SUB needs it)?
–Yes!
•Generally speaking:
– Forwarding occurs when a result is passed directly to functional
unit that requires it.
–Result goes from output of one unit to input of another
When can weforward?
RegMem DM Reg
RegMem DM Reg
RegMem DM
RegMem
Time
ADD R1, R2, R3
SUB R4, R1, R5
AND R6, R1, R7
OR R8, R1, R9
XOR R10, R1, R11
RegMem
SUB gets info.
from EX/MEM
pipe register
1
7
AND gets info.
from MEM/WB
pipe register
OR gets info. by
forwarding from
register file
Rule of thumb: If line goes “forward” you can do forwarding.
If its drawn backward, it’s physically impossible.
Data hazard
19
Data Hazard Classification
•Three types of data hazards
–RAW : Read After Write
–WAW : Write After Write
–WAR : Write After Read
•RAR : Read After Read
–Is this a hazard?
Read After Write (RAW)
•A read after write (RAW) data hazard refers to a
situation where an instruction refers to a result
that has not yet been calculated or retrieved.
•This can occur because even though an
instruction is executed after a previous
instruction, the previous instruction has not
been completely processed through the pipeline.
example:
i1. R2 <- R1 + R3
i2. R4 <- R2 + R3
Write After Read (WAR)
•A write after read (WAR) data hazard
represents a problem with concurrent
execution.
For example:
i1. R4 <- R1 + R5
i2. R5 <- R1 + R2
Write After Write (WAW
•A write after write (WAW) data hazard may occur
in a
concurrent execution environment.
example:
i1. R2 <- R4 + R7
i2. R2 <- R1 + R3
We must delay the WB (Write Back) of i2 until the execution of i1
Control Hazards (Effect of Branching)
•Control hazard occurs whenever the pipeline makes an incorrect branch
prediction decisions, resulting in instructions entering the pipeline that
needs to be discarded.
•These result from the pipelining of branches and other instructions that
change the Program Counter.
•The action of fetching a non sequential or remote instruction after a branch
instruction is called branch taken.
•The instruction to be executed after a branch taken is called a branch
target.
•When a branch taken occurs all the instructions following the branch in the
pipeline becomes useless and will be drained from the pipeline.
23
Branch hazards
•Branch hazards can cause a greater performance loss for
pipelines
•When a branch instruction is executed, it may or may not
change the PC
•If a branch changes the PC to its target address, it is a
taken branch Otherwise, it is untaken
Control Hazard Example
12: BEQ R
1,
R
3,
36
16: AND R
2,
R
3,
R
5
20: OR R
6,
R
1,
R
7
24: ADD R
8,
R
1,
R
9
36: XOR R
10, R
1, R
11
Here 12,16,20,24,36 are the address of the instructions.
BEQ (Branch if Equal) instructions executes and if R1 and R3 are equal, so jump to instruction whose
memory address is 36.
So, once branch is taken, the next three in-order instructions which entered into the pipeline needs to
discarded. Hence, clock cycles are wasted and pipeline is stalled.
25
•A branch instruction can be conditional or unconditional.
•The branch instruction breaks the normal sequence of the
instruction stream, causing difficulties in the operation of the
instruction pipeline.
Handling Branching Instructions
26
Prefetch target instruction
•Prefetch the target instruction in addition to the instruction
following the branch.
•If the branch condition is successful, the pipeline continues
from the branch target instruction.
28
Loop Buffer
•Very fast memory
•Maintained by fetch stage of pipeline
•Check buffer before fetching from memory
•When a program loop is detected it is stored in the loop buffer in
its entirety including all branches.
•The loop buffer is similar (in principle) to a cache dedicated to
instructions. The differences are that the loop buffer only retains
instructions in sequence, and is much smaller in size (and lower
in cost).
29
Branch target buffer (BTB)
•BTB is an associative memory.
•Each entry in the BTB consists of the address of a previously
executed branch instruction and the target instruction for the
branch.
30
Branch Prediction
•A pipeline with branch prediction uses some additional logic
to guess the outcome of a conditional branch instruction
before it is executed.
31
Branch Prediction
•Various techniques can be used to predict whether a branch will be
taken or not:
–Prediction never taken
–Prediction always taken
–Prediction by opcode
–Branch history table
•The first three approaches are static: they do not depend on the
execution history up to the time of the conditional branch instruction.
The last approach is dynamic: they depend on the execution history.
32
Delayed Branch
•In this procedure, the compiler detects the branch instruction
and rearrange the machine language code sequence by
inserting useful instructions that keep the pipeline operating
without interrupts.
33
Example
•Five instructions need to be carried out:
Load from memory to R1
Increment R2
Add R3 to R4
Subtract R5 from R6
Branch to address X
34