Compiler Design Unit 1

3,684 views 43 slides Mar 07, 2022
Slide 1
Slide 1 of 43
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43

About This Presentation

Phases of compiler


Slide Content

UNIT 1
INTRODUCTION
Mrs.D.Jena Catherine Bel,
Assistant Professor, CSE,
Velammal Engineering College

Overview
•Translators
•Need for Compiler
•History
•Analysis-Synthesis model of Compilation
•Phases of Compiler
•Bootstrapping of Compiler
•Passes
•Complier Construction Tools
2
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Translators
•Programs accepting one form of language as input and
produce another form of language as output.
•A few translators are listed below.
•Compiler
•Interpreter
•Assembler
•Preprocessor
•Macro Processor
3
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Translators
Assembler
Translates assembly language program into machine language
Uses mnemonic codes for operation codes and data addresses.
Preprocessor
A translator whose source language and object language are high
level language is termed as preprocessor.
C Preprocessor-begins with #
Macro processor
A system program that replaces the macro calls in an assembly
language program by the macro definition.
During this, formal arguments are converted into the actual
arguments.
4
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Translators
Compiler
Translates a high level language (Source language) into a low
level language (Object language)
COBOL, FORTRAN to Assembly language or Machine
language.
Interpreter
Transforms a programming language into a simplified
language called intermediate code which can be directly
executed.
SNOBOL (String Name Oriented symBOlicLanguage ) to
Polish notation.
5
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Compilers Vs Interpreters
6
•Avoid Compilation for
every Execution
•Requires less time for
compilation
•Reporting errors in a
whole
•Very complex to design
•Produce intermediate
code directly for execution
–can’t reuse
•Requires large time-
reducing the speed of
execution
•Reports error one-by-one
•Easy to design
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Need for Compiler
Machine language:
Media that directly communicate with computer
Error prone
Assembly Language:
•Use mnemonic name for operation codes and data address
•Translator assembler helps it
Still it required how specific computer operates
High level programming:
Express algorithms in more natural notation
Without knowing how specific computer works
Translator compiler helps it
7
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

History of compilers
8
•Early computers written in Assembly language
•1950 -Machine –independent programming
language were introduced -leads to the
development of compilers
•1952 -first compiler was written by Grace Hopper
for A-0 Programming languages
•Translation is viewed as the compilation of a
sequence of machine language subprograms
selected from library
•1957 -first complete compiler introduced by
FORTRAN team led by John Backusat IBM
•1960 –COBOL compiled on multiple architectures
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

History of compilers
•1962 –first self-hosting compiler –compiling its
own source code by Tim Hart and Mike Levinat
MIT for Lisp
•1970 –Pascal and C Compilers written in their
own languages
•Building self-hosting compiler creates bootstrapping
problem –avoid by compiling either by a compiler in
different languages or running the compiler in different
languages
9
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

History of compilers
•1984 –GNU C Compiler by Richard Stallman
•For Unix Operating system
•1987 –first release of GCC compiler
•First portable ANSI C compiler as free software
•1992 –version 2.0
•Ability to compile C++
•1997 –compiler EGCS was created –
•to improve optimization
•2001 –version 3.0
10
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

History of compilers
•1995 –java compiler was introduced
•Major java compilers
•javac, included in JDK -open-sourced since nov 2006.
•GCJ-a part of gcc which compiles C, Fortran, Pascal and
other programming languages .
•ECJ-the Eclipse Compiler for Java, is an open source used
by the Eclipse JDT based on IBM Visual Age's Java compiler.

11
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

What is a Compiler?
•A compiler –translate human
oriented programming
language(source program) to
computer oriented machine
language (target program).
•A source program/codeis a
program/code written in the
source language, which is usually a
high-level language.
•A target program/codeis a
program/code written in the target
language, which often is a machine
language or an intermediate code.
12
compiler
Source
program
Target
program
Error
message
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Analysis and Synthesis Model of
Compilation
13
Two parts of compilation
•Analysis
•breaks up the source program into
constituent pieces (words, phrases)
•creates anintermediate representationof
the source program.
•Synthesis
•performs code optimization,
•constructs the desired target program from
the intermediate representation.
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Analysis and Synthesis Model of
Compilation
14
In Analysis
From the source
program generate three-
address code via tree
(An intermediate
representation)
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Analysis and Synthesis Model of
Compilation
Synthesis
From intermediate
code –to sequence of
target instructions
15
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Tools that Use the Analysis-
Synthesis Model
•Editors(syntax highlighting)
•Uses sequences of command to build the program
•Analyze the program text –and put them in
appropriate hierarchical structure
•Pretty printers(e.g. Doxygen)
•Analyze the program and print them in clearly visible
format
•Comments, indents
16
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Tools that Use the Analysis-
Synthesis Model
•Static checkers(e.g. Lint and Splint)
•Analyze and identify the errors without running the
program
•Text formatters(e.g. TeX and LaTeX)
•Read input as stream of characters, tables, figures
•Specify a hierarchy of boxes to be filled by some bit pattern
17
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Tools that Use the Analysis-
Synthesis Model
•Silicon compilers (e.g. VHDL)
•Input is conventional programming language
•Variable represents o’s or 1’s or signal
•Output is circuit
•Query interpreters/compilers (Databases)
•Translate Boolean and relational operators into
commands that enable search in databases
18
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

The phases of a compiler
•Six phases
•Scanner
•Parser
•Semantic Analyzer
•Intermediate Code Generator
•Code Optimizer
•Code generator
•Three auxiliary components
•Literal table
•Symbol table
•Error Handler
19
Compilation process divided into six sub process –
Phases
Take one form of representation of source program and
produce output as another form
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

The phases of a compiler
20
Scanner
Parser
Semantics Analyzer
Intermediate Code Generation
Code Optimizer
Code Generator
Literal
Table
Symbol
Table
Error
Handler
Source code
Syntax Tree
Annotated Tree
Intermediate code
Optimized code
Target code
Tokens
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

The phases of a compiler-
Scanner
Lexical analysis/Scanner:
it collects sequences of characters into meaningful
units called tokens
Regular expressions are useful notations to recognizes
tokens
Install class of a strings type token in symbol table and
return the token identified to the parser having token
type and the pointer to the symbol table
21
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

An example: a[index]=4+2
a identifier
[ left bracket
index identifier
] right bracket
= assignment
4 number
+ plus sign
2 number
Other operations: it may enter literals into the literal
table
22
The phases of a compiler-
Scanner
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•Syntax analysis/Parser:
•it determines the structure of the program
•Accepts tokens –verify the tokens with patterns that
permitted by the specific language
•The results of syntax analysis are a parse tree or a
syntax tree
•An example: a[index]=4+2;
•Parse tree
•Syntax tree ( abstract syntax tree)
23
The phases of a compiler-Parser
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

The Syntax Tree
24
a index 4 2
identifier identifier number number
subscript-expression additive-expression
Assign-expression
The phases of a compiler-Parser
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

The semantics of a program are its “meaning”,
as opposed to its syntax, or structure, that
determines some of its running time behaviors prior
to execution.
Static semantics: declarationsand type checking
Attributes: The extra pieces of information
computed by semantic analyzer
An example: a[index]=4+2
The syntax tree annotated with attributes
25
The phases of a compiler –Semantic
Analyzer
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

The Annotated Syntax Tree
26a index 4 2
identifier identifier number number
subscript-expression additive-expression
integer integer
Assign-expression
array of integer integer integer integer
The phases of a compiler-Semantic
Analyzer
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•An intermediate code generator
•takes a parse tree from the semantic analyzer
•generates a program in the intermediate language.
•In some compilers, a source program is translated into an
intermediate code first and then the intermediate code is
translated into the target language.
•In other compilers, a source program is translated directly
into the target language.
27
The phases of a compiler-Intermediate
Code Generation
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•Using intermediate code is beneficial when compilers
which translates a single source language to many
target languages are required.
•One of the popular intermediate code is three-
address code. A three-address code instruction is in
the form of x= y op z.
28
The phases of a compiler-Intermediate
Code Generation
t = 4 + 2
a[index]=t
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•Replacing an inefficient sequence of instructions with a
better sequence of instructions.
•Sometimes called code improvement.
•Code optimization can be done:
•after semantic analyzing
•performed on a parse tree
•after intermediate code generation
•performed on a intermediate code
•after code generation
•performed on a target code
29
The phases of a compiler-
Code Optimizer
An example: a[index]=4+2
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

30
t = 4 + 2
a[index]=t
t = 6
a[index]=t
a[index]=6
The phases of a compiler-
Code Optimizer
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•It takes the intermediate code or IR and generates
code for target machine
•The properties of the target machinebecome the
major factor:
•Using instructions and representationof data
•An example: a[index]=4+2
•Code sequencein a hypothetical assembly language
31
The phases of a compiler-
Code Generation
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

A possible code sequence
32
a[index]=6
MOV R0, index
MUL R0,2
MOV R1,&a
ADD R1,R0
MOV *R1,6
The phases of a compiler-
Code Generation
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Static (or compile-time) errors must be reported
by a compiler
Generate meaningful error messages and resume
compilation after each error
Each phase of a compiler needs different kind of error
handing
Exception handling
Generate extra code to perform suitable runtime tests
to guarantee all such errors to cause an appropriate
event during execution.
33
BACK
Auxiliary components-
Error Handling
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•Identifiers are namesof variables, constants, functions, data
types, etc.
•Store information associated with identifiers
•Information associated with different types of identifiers can be
different
•Information associated with variables are name, type,
address,size (for array), etc.
•Information associated with functions are name,type of return
value, parameters, address, etc.
34
Auxiliary components-
Symbol Table
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Auxiliary components-Symbol
Table
•Accessed in every phase of compilers
•The scanner, parser, and semantic analyzer put names of
identifiers in symbol table.
•The semantic analyzer stores more information (e.g.
data types) in the table.
•The intermediate code generator, code optimizer and
code generator use information in symbol table to
generate appropriate code.
•Mostly use hash table for efficiency.
35
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

•Store constants and strings used in program
•reduce the memory size by reusing constants and
strings
•Can be combined with symbol table
36
Auxiliary components-
Literal Table
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Passes
The repetitions to process the entire source program
before generating code are referred as passes.
Passesmay or may not correspond tophases
A pass often consists of several phases
A compiler can be one pass, which results in efficient compilation
but less efficient target code
Most compilerswith optimization use more than one pass
One Pass for scanning and parsing
One Pass for semantic analysis and source-level optimization
The third Pass for code generation and target-level optimization
37
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Bootstrapping
38
•Bootstrapping answers-“How the first compiler (a
translator program) was compiled”.
•Compiler is characterized by
•it’s source language
•Object language
•The language in which the compiler is written.
The notation C
S
LA
refers
the compiler Cwritten in the language Sto compile
the language Lto produce the object language A.
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Bootstrapping
•First Step: Write the compiler for the language L for the
machine A with object language A (C
A
LA
)
•is write a small compiler C
A
SA
that translates a subset S of the
language L into the machine language A.
•This is written in the machine language of the machine A.
•In the second step, write a compiler for L in the language S for
which the compiler is already written. (C
S
LA
)
•In the third step compile the compiler program written in S for L
using the compiler for S.
39
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Bootstrapping
•Using the compiler C
A
LA
it will be possible to write
compiler for any machine.
•A compiler running on one machine, producing object
code for another machine is called Cross Compiler.
40
C
A
SA
C
S
LA
C
A
LA
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Compiler Construction Tools
•The software producing compilers on giving the specification
of the source language and the target machine
•Some of the compiler writing tools are compiler-compilers,
compiler generators, translator writing systems.
•The compiler writing tools need the following information.
•Description of the lexical and syntactic structure of the source
language.
•Description of what output is to be generated for each source
language constructs.
•Description of the target machine.
41
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Compiler Construction Tools
The principal aids provided by the existing compiler –compilers are
Scanner Generator
•Transition diagram is the formal tool used for generating scanners.
•The transition diagrams are constructed using the regular expressions
defined to recognizing the various types of tokens.
Parser Generator
•The parsers are normally non-procedural and written using Context-
free grammars.
•The context free grammars are quite easy to specify the syntactic
structure of any language
•Highly reliable and easy for any correction.

42
Mrs.D.Jena Catherine Bel, AP/CSE, VEC

Compiler Construction Tools
Facilities for code generation
•Code generation is done by writing high level language
routines and calling them at the right time.
•The selection of the routine and generating the code can be
done with the help of decision tables.
43
Mrs.D.Jena Catherine Bel, AP/CSE, VEC