0-Introduction compiler, phases of compiler .ppt

arslanahmadkhan295 5 views 27 slides Oct 23, 2025
Slide 1
Slide 1 of 27
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27

About This Presentation

in the presentation we have introduced the compiler and how to its work, the phases of compiler


Slide Content

Compiler

Construction
1

Compiler
A Compiler translate information from One
Language to Another, Say from English to Urdu or
similar.
As this course is related to Computers , so we will
keep in mind computer languages.
A compiler translates from High level language to
Machine Language.
OR
A translator that translate source program code
into target machine code.
E
n
g
r
.S
h
a
u
k
a
t

A
l
i
(
0
3
4
5
-
9
5
3
0
4
6
8
)
2

Compiler
If the target program is an executable machine-
language program, it can then be called by the
user to process inputs and produce outputs
3
COMPILER
Source
Program
Target
Program
TAGET PROGRAMINPUT OUTPUT

Phases of Compiler
A Compiler Basically operates into six Phases
Each Phase is a Logical Activity that transform the
source Program from one representation into another.

4

Phases of a Compiler 5
Source Program
Lexical Analyzer
1
Syntax Analyzer
2
Semantic Analyzer
3
Intermediate Code
Generator
4
Code Optimizer
5
Code Generator
6
Target Program
Symbol-table
Manager
Error Handler

1. Lexical Analysis
It is the first phase of compiler
It is also called linear analyzer or scanner
It takes the source code program as a
stream of characters and reads it from
Left-to-Right and broken them into Tokens
Token: It is sequence of characters
having collective meanings (words in
natural language)
6

Token VS Lexeme
LEXEME: A lexeme is a sequence of characters in the
source program that matches the pattern for a
token. It is the raw text from the code that has been
identified as a single logical unit.
E.g sum , result, person etc
TOKEN: A token is a pair, typically consisting of a
token name (or type) and an optional attribute
value. The token name is an abstract symbol
representing a kind of lexical unit (e.g., keyword,
identifier, operator).
7

1. Lexical Analysis 8
Position := initial + rate * 60 ;______ __ ____ _ ___ _ __ _
id1 := Id2 + Id3 * 60
All are Lexemes Which are
converted to tokens and then passed
to next phase

2. Syntax Analyzer
It is the 2nd phase of compiler
It is also called Parser or Hierarchical Analyzer
It takes the stream of tokens as input and generate
parse tree
9

2. Syntax Analyzer
The hierarchical structure of a program is usually expressed by
recursive rules
Example:
Following are the rules for the definition of expression
1.An identifier is an expression
2.Any number is an expression
3.If expression
1
and expression
2
are expressions then so are :
Expression
1 + expression
2
Expression
1 * expression
2
(Expression
1
)
Thus by rule 1 initial and rate are expressions. By rule 2 60 is an
expression while by rule 3 we can first infer that rate * 60 is an
expression and finally that initial + rate * 60 is an expression.
10

2. Syntax Analyzer 11
identifier
identifier
expression
identifier
expression
number
expression
expression
expression
Assignment
Statement
position
:=
+
*
60
initial
rate
Position := initial + rate * 60

3. Semantic Analyzer
It is the 3rd phase of the compiler
This phase checks the source program for semantic errors
also called meanings. i.e. the operation applied in the source
program are meaningful or not
It uses the hierarchical structure determined by the syntax
analysis phase to identify the operators and operands of
expressions and statements
Its main function is the type checking i.e. Here compiler
checks that each operator has operands that are permitted
by the source language specification
12

3. Semantic Analyzer
Example:
Many programming languages definitions require a compiler to
report an error every time a real number is used to index an array
However, the language specification may permit some operand
coercions. e.g.
When a binary arithmetic operator is applied to an integer and real. In
this case the compiler may need to convert the integer to a real.
E
n
g
r
.S
h
a
u
k
a
t

A
l
i
(
0
3
4
5
-
9
5
3
0
4
6
8
)
13

3. Semantic Analyzer
Example:
Inside a machine the bit pattern representing an integer is
generally different from the bit pattern for a real. Even if the
integer and the real number happen to have the same value
Suppose that all identifiers in figure have been declared to
be real and that 60 by itself is assumed to be an integer.
Type checking of Figure1 show that * is applied to a real
rate and an integer 60
The general approach is to convert the integer to a real. This
has been achieved in Figure2 by creating an extra node for
the operator inttoreal that explicitly converts an integer into a
real
E
n
g
r
.S
h
a
u
k
a
t

A
l
i
(
0
3
4
5
-
9
5
3
0
4
6
8
)
14

3. Semantic Analyzer
E
n
g
r
.S
h
a
u
k
a
t

A
l
i
(
0
3
4
5
-
9
5
3
0
4
6
8
)
15
position
initial
rate
:=
+
*
60
Compressed Tree
Figure 1
position
initial
rate
:=
+
*
inttoreal
60
Conversion Action
Figure 2

3. Semantic Analyzer 16
Position := initial + rate * 60
:=
id1
id2
id3
+
*
inttoreal
60
Abstract Tree

4. Intermediate Code Generator
ICG is a program for an Abstract Machine
This code should have two properties
It should be easy to generate
It should be easy to translate into the target program
It can be represented in many form. One of them is called “Three
Address Code” which means that every statement of intermediate
Code can have:
At most Three Operands (Variables) and
At most One Operator in addition to Assignment Operator
It also checks the precedence of Operator
ICG generates temporary variables to hold the result of each
operation
17

4. Intermediate Code Generator
For the parse tree given in Semantic Analysis phase the
intermediate code can be generated as:
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
18
3 address code

5. Code Optimizer
The Code Optimizer improve the intermediate code, so that
fast running machine code is achieved
The intermediate code given below:
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
Can be optimized by Code Optimizer as:
temp1 =: id3 * 60.0
id1 =: id2 + temp1
19

6. Code Generation
It is the final phase of compiler which generate either
re-locatable machine code or assembly language code
The next job of the code generator is to select memory
locations for each and every variable used by the
program
Then intermediate instructions are each translated into
a sequence of machine instructions that perform the
same task
For example using register 1 and 2 the translation of
the optimized code into respective code is given below:
20

6. Code Generation
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2 , R1
MOVF R1 , id1
The first operand of each instruction specifies a source and
second operand specifies destination. The F in each instruction
tells us that it deals with floating-point numbers.
21

6. Code Generation
The code loads the contents of address id3 into register R2, then
multiplies it with floating-point constant 60.0. The # signifies that
60.0 is to be treated as an immediate constant. The third
instruction moves id2 into register R1 and the fourth adds to it the
value previously computed in register R2. Finally, the value in
register R1 is stored into the address of id1 , so the code correctly
implements the assignment statement
22

Reviewing the Entire Process23
E
R
R
O
R
S
position := initial + rate * 60
Lexical Analyzer
Syntax Analyzer
Semantic Analyzer
Intermediate Code Generator
id1 := id2 + id3 * 60
:=
id1
id2
id3
+
*
60
:=
id1
id2
id3
+
*
inttoreal
60
Symbol Table
position ....
initial ….
rate….

Reviewing the Entire Process24
E
R
R
O
R
S
Intermediate Code Generator
Code Optimizer
Code Generator
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
temp1 := id3 * 60.0
id1 := id2 + temp1
position ....
initial ….
rate….
Symbol Table
3 Address Code

Symbol Table
This is the informal phase of compiler.
It is a data structure, containing information about identifiers
(variables) used in the source program in the form of various
attributes.
These attribute provide the information about :
Storage allocated
Type
Scope (visibility)
Procedure names (Function names).
In case of procedure names the attribute may be:
Arguments and return type of functions
The names, types and number of arguments.
The methods of passing arguments. i.e. either by
reference or by value
25

Error Handler
Error
An Error is an abnormal condition in the source program which
either stops the compilation or results in undesired output.
Error Handler:
The two basic task of compiler are
Error Detection
Error Recovery
During the compilation process each phase may encounter
errors so error handler is used to handle these errors
Error handler contains error handling routines
26

27
Tags