Principal Sources of Optimization in compiler design

19,598 views 47 slides Nov 18, 2023
Slide 1
Slide 1 of 47
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47

About This Presentation

Principal Sources of Optimization in compiler design


Slide Content

Code Optimization Principal Sources of Optimization – Peep-hole optimization – DAG- Optimization of Basic Blocks–Global Data Flow Analysis – Efficient Data Flow Algorithm – Recent trends in Compiler Design

Code optimization program transformation technique, which tries to improve the code by eliminating unnecessary code lines and arranging the statements in a sequence to make it consume less resources (i.e. CPU, Memory) and deliver high speed. Advantages: Executes faster. Efficient memory usage. yields better performance.

Compiler front end Lexical analysis, Syntax analysis, Semantic analysis Tasks : understanding the source code, making sure the source code is written correctly. Code optimization

Compiler back end: Intermediate code generation/improvement and machine code generation/improvement. Tasks: translating the program to a semantically the same program(in a different language). Code optimization

A code optimizer sits between the frontend and the code generated. Works with intermediate code Can do control flow analysis Can do data flow analysis Does transformations to improve the intermediate code. Code optimization is done on basic blocks . Code optimization

Types of code optimization Machine Independent Optimization Improve the intermediate code to get a better target code as output Intermediate code transformed doesn’t involve any CPU registers or absolute memory locations Machine Dependent Optimization Improve the target code according to the target machine architecture Involves CPU registers and may have absolute memory references. Put efforts to take maximum advantage of the memory

Local Optimization Apply to a single block Straight line code Simplest form No need to analyse the whole procedure body Global Optimisation Apply across basic blocks Applied to program segments that includes functions and loops Data flow analysis is done to perform optimisation Peephole Optimization Performed on a small set of compiler generated instructions Operates on the target code considering few instructions at a time Do machine dependent improvements Most compilers do (1), many do (2) and very few do (3) Types of code optimization

Basic blocks: Straight line code with no branches Sequence of intermediate code with a single entry and a single exit. CFGs show control flow among basic blocks Optimization is done on the basic blocks A basic block begins in one of the following ways The entry point into the functions The target of a branch (can be a label) The instruction immediately following a branch Basic blocks and flow graph

A basic block ends in any of the following ways a jump statement a conditional or unconditional branch a return statement. Basic blocks and flow graph Contd …

Properties of basic blocks Property 1: Sequence of consecutive statements executed in a sequential manner. Property 2: One entry point and one exit point Property 3: Should not contain conditional or unconditional control statements in the middle of the block. It may be the first or last statement.

Algorithm for partitioning a three address code into basic block. Rules Determine the leaders Determine the basic blocks. 1. Determine the leaders 1 st statement is a leader Target of a conditional or unconditional statement is a leader Statement following the conditional or unconditional jump is a leader.

Determine the basic block and draw the flow graph for the given 3 address code PROD = 0 I = 1 T2 = addr (A) T4 = addr (B) T1 = 4*I T3 = T2[T1] PROD = PROD+T3 I = I+1 If I<=20 goto (5) j = j+1 k = k+1 If j<=5 goto (7) I = i+j Three address code

2. Determine the basic blocks begin from the leader statement and end in the statement before the next leader. Leaders as per rules – 1,5,7,10,13 PROD = 0 I = 1 B1 T2 = addr (A) T4 = addr (B) T1 = 4*I T3 = T2[T1] B2

7. PROD = PROD+T3 8. I = I+1 B3 9. If I<= 20 goto (5)(B2) 10. j = j+1 11. k = k+1 B4 12. If j<=5 goto (7)(B3) 13. i = i+j B5

Flow graph Directed graph in which flow control information is added to the basic blocks. Basic blocks are the nodes of the graph. B2 B1 B3 B4 B5

Directed Acyclic Graph(DAG) DAG representation of a basic block represent the structure of a basic block In a DAG, internal node represents operators. Leaf node represents identifiers, constants. Internal node also represents result of expressions. t = a+b t a b +

Application of DAG Determining the common sub expressions. Determining which names are used inside the blocks & computed outside the block. Determining which statement of the block could have their computed value outside the blocks. Simplifying the list of quadruples by eliminating common sub expressions.

( eg ) Construct DAG for the expression a + a*(b-c)+(b-c)*d + + * * a - d b c

a = b + c b = a - d c c = b + c d = a - d - b, d + a d b c + + -

a = b*-c + b*-c a b c = - * +

a = b+c b = b-d c = c+d e = b+c a b c b c d + + + -

d = b * c e = a + b b = b * c a = e – d a e d, b a b c + - *

a = (a* b+c )-(a* b+c ) = a - + * c a b

Optimization of basic blocks 1. Finding Local Common sub expression 2. Dead code elimination. 3. Renaming temporary variables. 4. Interchange of statements. 5. Algebraic transformations.

Peephole Optimization Applied to improve the performance of program by examining a short sequence of instructions in a window(peephole) and replace the instructions by a faster or short sequence of instructions.

Peephole Optimization Techniques Eliminating Redundant Loads and Stores. Eliminating Unreachable Code. Flow of control optimizations. Algebraic Simplifications Machine idioms.

1. Eliminating redundant loads and stores Consider one instruction mov R0, a mov a, R0 eliminate 2 nd instruction since a is already in R0.

2. Eliminating Unreachable Code Eliminate the statements which are unreachable. ( i.e ) never executed. ( eg:1) i = 0; (eg:2) int add( int a, b) if( i ==1) { { i =0 int c= a+b ; sum = 0; return c; } printf (“%d”, c); Eliminate }

3. Flow of Control Optimizations Unnecessary jumps can be eliminated ( eg ) Multiple jumps can make the code inefficient. Above code can replaced goto L1 goto L3 ….. ….. L1: goto L2 L1: goto L3 ….. ….. L2: goto L3 L2: goto L3 ….. L3: Mov a, R0 L3: Mov a, R0

4. Algebraic Simplifications ( eg ) x = x+0 (or) x = x*1 The above statements can be eliminated because by executing those statements the result x won’t change. ( eg ) a = replaced with a = x*x b = y/8 replaced with b = y>>3

5. Use of Machine Idioms It is the process of using powerful features of CPU instructions. ( eg ) auto inc/ dec function can be used to inc/ dec variable. a = a+1 replaced with inc a. b = a-1 replaced with dec a.

Principal Sources of Optimization (or) Semantic-Preserving Transformation Common sub expression elimination. Compile time evaluation Constant folding. Constant propagation. Code movement (or) Code motion. Dead code elimination. Strength reduction.

1. Common Sub expression Elimination Common sub expression an expression(E) which appears repeatedly in the program, which is computed previously but the values of variables in expression haven’t changed. replaces the redundant expression each time it is encountered.

Unoptimized code Optimized code a= b+c a= b+c b=a-d b=a-d c= b+c c= b+c d=a-d d=b c=a-d

2. Compile Time Evaluation done at the compile time instead of run time. a. Constant Folding: Evaluate the expression and submit the result. ( eg ) area=(22/7)*r*r 22/7 is calculated and result 3.14 is replaced.

Contd … b. Constant Propagation: propagate the constant through out the expression. Constant replace a variable. ( eg ) pi = 3.14, r = 5 area = pi*r*r area = 3.14*5*5

3. Code Movement (or) Code Motion moves the code outside the loop if it won’t have any difference if it executes inside or outside the loop. Unoptimized Code Optimized Code for( i =0; i <n; i ++) x = y+3 { for( i =0; i <n; i ++) x = y+3; { a[ i ] = 6* i ; a[ i ] = 6* i ; } }

4. Dead Code Elimination Includes elimination of those statements which are never executed or if executes output is never used. ( eg ) Unoptimized Code Optimized Code i =0 i =0 if( i == i ) { a= x+i ; }

Contd … Unoptimized Code Optimized Code int add( int x, int y) int add( int x, int y) { { int z; int z; z= x+y ; z= x+y ; r eturn z; r eturn z; printf (“% d”,z ); } }

5. Reduction in Strength Replacing expensive operator with cheapest operator b = a*2 is replaced with b = a+a

Data Flow A nalysis (DFA) All the optimization techniques depend on data flow analysis. DFA know about how the data is flowing in any control-flow graph. DFA – collects all information about entire program and distribute this information to each block in the flow graph. It helps in identifying the variables that hold values at different points in the program and how these values change over time . This information is used to optimize the program by eliminating dead code, identifying constant expressions, and reducing unnecessary computations .

Data Flow A nalysis Helps in improving the efficiency of the code generated by the compiler Program debugging Program optimization

Global Data Flow A nalysis Collect the data flow information about entire program and distribute the information to each block Data flow is collected using the data flow equation out [s] = gen [s] U(in[s] – kill[s]) out [ s] – Info at the end of S gen [s] - Info generated by S in[s ] - Info enter at the beginning of S kill[s] – Info killed by S i.e information that comes out of S is info generated by S + information coming to the block S and the information killed in the block S

Global Data Flow A nalysis Structured Statement S ->id :=E/S ; S S -> if E then S1 else S2/ do S while E E- > id+id /id

Global Data Flow A nalysis

Efficient Data Flow Algorithm Redundant common sub expression elimination Copy Propagation a :=b y = x z = 3 + y Copy propagation would yield: z = 3 + x Induction Variable The value of the variable changes value every time With each iteration, its value either gets incremented or decremented by some constant value.