phases of compiler

SabeehSafdar2 49 views 17 slides Jan 06, 2021
Slide 1
Slide 1 of 17
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17

About This Presentation

Phases of Compiler


Slide Content

Phases of Compiler MUHAMMAD SABEEH SAFDAR 2017-ag-7953 BScs 7th

Topics Phases of Compiler Lexical Analysis What is token? What is stream of token? What is lexeme? What is Pattern? What is valid token? Regular Expression Finite Automata Context free grammar Syntax Analysis Derivation Left-most Derivation Right-most Derivation

Phases of Compiler

Lexical Analysis It is also called scanner. Lexical analysis  is the first phase of a compiler. It takes the modified source code from language preprocessors that are written in the form of sentences. The  lexical analyzer  breaks these syntaxes into a series of tokens. B y removing any whitespace or comments in the source code.

What is Token? What is a token? A lexical token is a sequence of characters that can be treated as a unit in the grammar of the programming languages. Example of tokens: Type token (id, number, real, . . . ) Punctuation tokens (IF, void, return, . . . ) Alphabetic tokens (keywords)

What is Lexeme? A lexeme is a sequence of characters in the source program that is matched by the pattern for a token. Lexeme is a specific instance of a token. Example: x = a + b * 2

What is Pattern? Pattern is a rule describing all those lexemes that can represent a particular token in a source language. Example Identifier: start with an alphabet or _ followed by any alphanumeric character .

What is Valid Token? There are some predefined rules for every  lexeme  to be identified as a  valid token . These rules are defined by grammar rules, by means of a pattern.

Regular Expression Regular expression is a formula that describes a possible set of string. Component of regular expression.. X the characters . any character, usually accept a newline [x y z] any of the characters x, y, z,….. R? a R or nothing (=optionally as R) R* zero or more occurrences….. R+ one or more occurrences…… R1R2 an R1 followed by anR2 R1|R1 either an R1 or anR2.

Finite Automata Finite Automata(FA) is the simplest machine to recognize patterns. The finite automata or finite state machine is an abstract machine which have five elements or tuple. There are two types NFA and DFA

Deterministic Finite Automata ( DFA) In a DFA, for a particular input character, the machine goes to one state only. A transition function is defined on every state for every input symbol. Also in DFA null (or ε) move is not allowed, i.e., DFA cannot change state without any input character.  For example, below DFA with Σ = {0, 1} accepts all strings ending with 0.  One important thing to note is,  there can be many possible DFAs for a pattern . A DFA with minimum number of states is generally preferred

Non-Deterministic Finite Automata (N FA) NFA is similar to DFA except following additional features:  Null (or ε) move is allowed i.e., it can move forward without reading symbols.  Ability to transmit to any number of states for a particular input. As you can see in transition function is for any input including null (or ε), NFA can go to any state number of states.  One important thing to note is,  in NFA, if any path for an input string leads to a final state, then the input string accepted . 

Syntax Analyzer A syntax analyzer or parser takes the input from a lexical analyzer in the form of token streams. The parser analyzes the source code (token stream) against the production rules to detect any errors in the code. The output of this phase is a  parse tree . Why do you need Syntax Analyzer? Check if the code is valid grammatically The syntactical analyzer helps you to apply rules to the code Helps you to make sure that each opening brace has a corresponding closing balance Each declaration has a type and that the type must be exists

Syntax Analyzer

Derivation A derivation is basically a sequence of production rules, in order to get the input string. During parsing, we take two decisions for some sentential form of input: Deciding the non-terminal which is to be replaced. Deciding the production rule, by which, the non-terminal will be replaced. To decide which non-terminal to be replaced with production rule, we can have two options. Left-most Derivation R ight-most Derivation

Left-most Derivation In the left most derivation, the input is scanned and replaced with the production rule from left to right. So in left most derivatives we read the input string from left to right. Example Production Rules S= S+S S=S-S S= a|b|c Input a-b+c Left-most Derivation S= S+S S=S-S+S S=a-S+S S= a-b+S S= a-b+c

Right-most Derivation In the right most derivation, the input is scanned and replaced with the production rule from right to left . So in right most derivatives we read the input string from right to left . Example Production Rules S= S+S S=S-S S= a|b|c Input a-b+c Right-most Derivation S= S+S S=S-S+S S= S-S+c S= S-b+c S= a-b+c