Lexical Analyzer Implementation

284 views 27 slides Jul 16, 2021
Slide 1
Slide 1 of 27
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27

About This Presentation

This ppt is about the implementation of lexical analyzer.


Slide Content

Lexical Analysis - Implementation

Lexical Analysis

Tokens

Tokens

Tokens -> RE Specification of Tokens : The Patterns corresponding to a token are generally specified using a compact notation known as regular expression. Regular expressions of a language are created by combining members of its alphabet. A regular expression r corresponds to a set of strings L(r) where L(r) is called a regular set or a regular language and may be infinite .

RegEx A regular expression is defined as follows :- A basic regular expression a denotes the set {a} where a ∈Σ ; L(a ) ={a} The regular expression ε denotes the set { ε } If r and s are two regular expressions denoting the sets L(r ) and L(s) then; following are some rules for regular expressions

RegEx

RegEx

Token Recognition Finite Automata are recognizers that can identify the tokens occurring in input stream. Finite Automaton (FA) consists of: A finite set of states A set of transitions (or moves) between states: A special start state A set of final or accepting states

Token Recognition

Token Recognition

Token Recognition

Transition Digrams

Input Buffering Buffer Pairs: Due to amount of time taken to process characters and the large number of characters that must be processed. Specialized buffering techniques are developed to reduce the amount of overhead required to process a single input character.

Language to Specify LA Lex , allows one to specify a lexical analyzer by specifying regular expressions to describe patterns for tokens. The input notation for the Lex tool is referred to as the Lex language and the tool itself is the Lex compiler . Behind the scenes , the Lex compiler transforms the input patterns into a transition diagram and generates code , in a file called lex.yy.c , that simulates this transition diagram .

Language to Specify LA Creating LA with Lex:-

Language to Specify LA In simple words:- LEX converts Lex source program to Lexical analyzer. Lexical Analyzer converts input stream into tokens.

Lex Source Program - Structure A Lex program has the following form :- Auxiliary D efinitions – it denotes Res of the form D1 = R1 //D i is the shortcut name for RE D2 = R2 // R i is the RE Translation Rules – Rules that tell LA which action to take upon encountering these tokens.

Lex Source Program - Structure Auxiliary Definitions:- D1 (letter) = A | B | C…| Z | a | b…| z (R1) D2 (digit) = 0 | 1| 2….. | 9 (R2) D3 (identifier) = letter (letter | digit)* D4 (integer) = digit + D5 (sign) = + | - D6 (signed-integer ) = sign integer

Lex Source Program - Structure Translation Rules:- The translation rules each have form: P i { Action i } Each pattern is a regular expression, which may use the regular definitions of the declaration section. The actions are fragments of code , typically written in C . Ex: for ‘keyword’-> begin {return 1} Ex: for ‘variable’-> identifier {install(); return 6}

Lex Source Program - Structure

Lex Source Program - Structure

Lex Source Program - Structure

Implementation of LA Lex generates LA as its o/p by taking Lex program as i /p. Lex program is collection of patterns (REs) and their corresponding actions. Patterns represent tokens to be recognized by LA to be generated. For each pattern, corresponding NFA will be designed.

Implementation of LA There can be ‘n’ no. of NFAs for ‘n’ no. of patterns. A start state is taken & using ε -transitions, all NFAs are combined. The final state of each NFA show that it has found its own token P i . Convert NFA into DFA. The final state of DFA shows the token we have found.

Implementation of LA If none of the states of DFA include any final states of NFA, then an error is reported. If final state of DFA includes more than one final state of NFA, then final state for pattern coming first in transition rule has priority. ************ -----------------************