Principal Sources of Optimization in compiler design
19,598 views
47 slides
Nov 18, 2023
Slide 1 of 47
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
About This Presentation
Principal Sources of Optimization in compiler design
Size: 686.7 KB
Language: en
Added: Nov 18, 2023
Slides: 47 pages
Slide Content
Code Optimization Principal Sources of Optimization – Peep-hole optimization – DAG- Optimization of Basic Blocks–Global Data Flow Analysis – Efficient Data Flow Algorithm – Recent trends in Compiler Design
Code optimization program transformation technique, which tries to improve the code by eliminating unnecessary code lines and arranging the statements in a sequence to make it consume less resources (i.e. CPU, Memory) and deliver high speed. Advantages: Executes faster. Efficient memory usage. yields better performance.
Compiler front end Lexical analysis, Syntax analysis, Semantic analysis Tasks : understanding the source code, making sure the source code is written correctly. Code optimization
Compiler back end: Intermediate code generation/improvement and machine code generation/improvement. Tasks: translating the program to a semantically the same program(in a different language). Code optimization
A code optimizer sits between the frontend and the code generated. Works with intermediate code Can do control flow analysis Can do data flow analysis Does transformations to improve the intermediate code. Code optimization is done on basic blocks . Code optimization
Types of code optimization Machine Independent Optimization Improve the intermediate code to get a better target code as output Intermediate code transformed doesn’t involve any CPU registers or absolute memory locations Machine Dependent Optimization Improve the target code according to the target machine architecture Involves CPU registers and may have absolute memory references. Put efforts to take maximum advantage of the memory
Local Optimization Apply to a single block Straight line code Simplest form No need to analyse the whole procedure body Global Optimisation Apply across basic blocks Applied to program segments that includes functions and loops Data flow analysis is done to perform optimisation Peephole Optimization Performed on a small set of compiler generated instructions Operates on the target code considering few instructions at a time Do machine dependent improvements Most compilers do (1), many do (2) and very few do (3) Types of code optimization
Basic blocks: Straight line code with no branches Sequence of intermediate code with a single entry and a single exit. CFGs show control flow among basic blocks Optimization is done on the basic blocks A basic block begins in one of the following ways The entry point into the functions The target of a branch (can be a label) The instruction immediately following a branch Basic blocks and flow graph
A basic block ends in any of the following ways a jump statement a conditional or unconditional branch a return statement. Basic blocks and flow graph Contd …
Properties of basic blocks Property 1: Sequence of consecutive statements executed in a sequential manner. Property 2: One entry point and one exit point Property 3: Should not contain conditional or unconditional control statements in the middle of the block. It may be the first or last statement.
Algorithm for partitioning a three address code into basic block. Rules Determine the leaders Determine the basic blocks. 1. Determine the leaders 1 st statement is a leader Target of a conditional or unconditional statement is a leader Statement following the conditional or unconditional jump is a leader.
Determine the basic block and draw the flow graph for the given 3 address code PROD = 0 I = 1 T2 = addr (A) T4 = addr (B) T1 = 4*I T3 = T2[T1] PROD = PROD+T3 I = I+1 If I<=20 goto (5) j = j+1 k = k+1 If j<=5 goto (7) I = i+j Three address code
2. Determine the basic blocks begin from the leader statement and end in the statement before the next leader. Leaders as per rules – 1,5,7,10,13 PROD = 0 I = 1 B1 T2 = addr (A) T4 = addr (B) T1 = 4*I T3 = T2[T1] B2
7. PROD = PROD+T3 8. I = I+1 B3 9. If I<= 20 goto (5)(B2) 10. j = j+1 11. k = k+1 B4 12. If j<=5 goto (7)(B3) 13. i = i+j B5
Flow graph Directed graph in which flow control information is added to the basic blocks. Basic blocks are the nodes of the graph. B2 B1 B3 B4 B5
Directed Acyclic Graph(DAG) DAG representation of a basic block represent the structure of a basic block In a DAG, internal node represents operators. Leaf node represents identifiers, constants. Internal node also represents result of expressions. t = a+b t a b +
Application of DAG Determining the common sub expressions. Determining which names are used inside the blocks & computed outside the block. Determining which statement of the block could have their computed value outside the blocks. Simplifying the list of quadruples by eliminating common sub expressions.
( eg ) Construct DAG for the expression a + a*(b-c)+(b-c)*d + + * * a - d b c
a = b + c b = a - d c c = b + c d = a - d - b, d + a d b c + + -
a = b*-c + b*-c a b c = - * +
a = b+c b = b-d c = c+d e = b+c a b c b c d + + + -
d = b * c e = a + b b = b * c a = e – d a e d, b a b c + - *
a = (a* b+c )-(a* b+c ) = a - + * c a b
Optimization of basic blocks 1. Finding Local Common sub expression 2. Dead code elimination. 3. Renaming temporary variables. 4. Interchange of statements. 5. Algebraic transformations.
Peephole Optimization Applied to improve the performance of program by examining a short sequence of instructions in a window(peephole) and replace the instructions by a faster or short sequence of instructions.
Peephole Optimization Techniques Eliminating Redundant Loads and Stores. Eliminating Unreachable Code. Flow of control optimizations. Algebraic Simplifications Machine idioms.
1. Eliminating redundant loads and stores Consider one instruction mov R0, a mov a, R0 eliminate 2 nd instruction since a is already in R0.
2. Eliminating Unreachable Code Eliminate the statements which are unreachable. ( i.e ) never executed. ( eg:1) i = 0; (eg:2) int add( int a, b) if( i ==1) { { i =0 int c= a+b ; sum = 0; return c; } printf (“%d”, c); Eliminate }
3. Flow of Control Optimizations Unnecessary jumps can be eliminated ( eg ) Multiple jumps can make the code inefficient. Above code can replaced goto L1 goto L3 ….. ….. L1: goto L2 L1: goto L3 ….. ….. L2: goto L3 L2: goto L3 ….. L3: Mov a, R0 L3: Mov a, R0
4. Algebraic Simplifications ( eg ) x = x+0 (or) x = x*1 The above statements can be eliminated because by executing those statements the result x won’t change. ( eg ) a = replaced with a = x*x b = y/8 replaced with b = y>>3
5. Use of Machine Idioms It is the process of using powerful features of CPU instructions. ( eg ) auto inc/ dec function can be used to inc/ dec variable. a = a+1 replaced with inc a. b = a-1 replaced with dec a.
Principal Sources of Optimization (or) Semantic-Preserving Transformation Common sub expression elimination. Compile time evaluation Constant folding. Constant propagation. Code movement (or) Code motion. Dead code elimination. Strength reduction.
1. Common Sub expression Elimination Common sub expression an expression(E) which appears repeatedly in the program, which is computed previously but the values of variables in expression haven’t changed. replaces the redundant expression each time it is encountered.
2. Compile Time Evaluation done at the compile time instead of run time. a. Constant Folding: Evaluate the expression and submit the result. ( eg ) area=(22/7)*r*r 22/7 is calculated and result 3.14 is replaced.
Contd … b. Constant Propagation: propagate the constant through out the expression. Constant replace a variable. ( eg ) pi = 3.14, r = 5 area = pi*r*r area = 3.14*5*5
3. Code Movement (or) Code Motion moves the code outside the loop if it won’t have any difference if it executes inside or outside the loop. Unoptimized Code Optimized Code for( i =0; i <n; i ++) x = y+3 { for( i =0; i <n; i ++) x = y+3; { a[ i ] = 6* i ; a[ i ] = 6* i ; } }
4. Dead Code Elimination Includes elimination of those statements which are never executed or if executes output is never used. ( eg ) Unoptimized Code Optimized Code i =0 i =0 if( i == i ) { a= x+i ; }
Contd … Unoptimized Code Optimized Code int add( int x, int y) int add( int x, int y) { { int z; int z; z= x+y ; z= x+y ; r eturn z; r eturn z; printf (“% d”,z ); } }
5. Reduction in Strength Replacing expensive operator with cheapest operator b = a*2 is replaced with b = a+a
Data Flow A nalysis (DFA) All the optimization techniques depend on data flow analysis. DFA know about how the data is flowing in any control-flow graph. DFA – collects all information about entire program and distribute this information to each block in the flow graph. It helps in identifying the variables that hold values at different points in the program and how these values change over time . This information is used to optimize the program by eliminating dead code, identifying constant expressions, and reducing unnecessary computations .
Data Flow A nalysis Helps in improving the efficiency of the code generated by the compiler Program debugging Program optimization
Global Data Flow A nalysis Collect the data flow information about entire program and distribute the information to each block Data flow is collected using the data flow equation out [s] = gen [s] U(in[s] – kill[s]) out [ s] – Info at the end of S gen [s] - Info generated by S in[s ] - Info enter at the beginning of S kill[s] – Info killed by S i.e information that comes out of S is info generated by S + information coming to the block S and the information killed in the block S
Global Data Flow A nalysis Structured Statement S ->id :=E/S ; S S -> if E then S1 else S2/ do S while E E- > id+id /id
Global Data Flow A nalysis
Efficient Data Flow Algorithm Redundant common sub expression elimination Copy Propagation a :=b y = x z = 3 + y Copy propagation would yield: z = 3 + x Induction Variable The value of the variable changes value every time With each iteration, its value either gets incremented or decremented by some constant value.