ATFL-Unit-3.pptx for soujanya information

bsoujanya00 0 views 55 slides Oct 15, 2025
Slide 1
Slide 1 of 55
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55

About This Presentation

This ppt is very clear and understand.it is very useful for students and teachers.


Slide Content

Context Free Grammar

Define Grammar: Grammar is a set of rules used to define a valid Sentence in any language. Mathematical representation of grammar: Grammar is defined as a 4-tuple  { V,T,P,S } where V - Finite set of non-terminal/variables Represented in uppercase (states) T - Finite set of terminals/symbols Represented in lower case P - Finite set of production rules. S - start Symbol Non terminals: A,B,C…….. Terminals: a,b,c…………. Production Rules: will be in the form of α  β α = LHS = Single Non terminal β = RHS = E/string & terminal/Non terminal example : A  aB, Production rule Meaning : from state A upon taking input symbol 'a' it moves to state B

Define Left Linear Grammar.  A Grammar is said to be left linear if all of its productions are of the form A  Bx A  x where A and B are non-terminals, x is the string of terminals Define Right Lin ear Grammar.  A Grammar is said to be Right linear if all of its productions are of the form. A  xB A  x where A and B are non-terminal, x is string of terminals.  Regular Grammar is either left linear or Right linear.

Chomsky has classified the grammar is divided into 4 types as follows:  Type 0 is known as unrestricted grammar. Type 1 is known as context-sensitive grammar. Type 2 is known as a context-free grammar. Type 3 is known Regular Grammar. Classification of Grammar

Context Free Grammars A language class larger than the class of regular languages Supports natural, recursive notation called “context-free grammar” Applications: Parse trees, compilers XML A palindrome is a word that reads identical from both ends E.g., madam, redivider , malayalam , But the language of palindromes is a CFL , because it supports recursive substitution (in the form of a CFG)

Context-Free Grammar: Definition A context-free grammar G=(V,T,P,S), where: V: Finite set of non-terminal/variables Represented in uppercase Letters (states) T: Finite set of terminals Represented in lower case P: set of productions, each of which is of the form A  α where A ∈ V, α ∈ (V U T )* Where each  is an arbitrary string of variables or terminals S  start variable CFG for the language of binary palindromes: G=({A},{0,1},P,A) P: A  0 A 0 | 1 A 1 | 0 | 1 |  As we know that a CFG has no context neither left nor right. This is why, it is known as CONTEXT-FREE. Many programming languages have recursive structure that can be defined by CFG’s.

Structure of a production A   1 |  2 | … |  k head body derivation A   1 A   2 A   3 … K. A   k The above is same as:

1 . C onstruct CFG for the language having any number of a’s over the set ∑ = { a } Regular Expression a* Solution :- Regular Expression a* L= { ε , a, aa, aaa ,……..} P: S  aS or S  aS | ε S  ε G = ({S}, {a, ε }, P, S) Problems on Context Free Grammar (CFG)

2 . C onstruct CFG for the language having any number of a’s and any number of b’s over the set ∑ = { a, b } Solution :- Regular Expression ( a + b )* L= { ε , a, b, aa, ab, ba, bb, aaa ,……..} P: S  aS or Productions (or) s  aS | bS | ε S  b S S  ε G = ({S}, { a,b , ε }, P, S)

3 . C onstruct CFG for the language having at least two a’s over the set ∑ = { a,b } Solution :- Regular Expression = ( a + b)* a (a + b)* a (a + b)* L= { a a, aba, abba, baaba……..} P: S  AaAaA A  aA | bA | ε G = ({S,A}, { a,b , ε }, P, S) Atleast 2 means >= 2 = { aa, aaa , aaaa…………..}

External Exam paper 2023-24

5 . construct CFG for a (a + b)* L= { a, ab,aba ,……..} P: S  aA A  aA | bA | ε G = ( {S,A},{ ε , a,b }, P , S ) 6 . C onstruct CFG for a* b* L= { ε , ab,aabb ,……..} P: S  AB A  aA | ε B  bB | ε G = ({S,A,B}, { ε , a, b}, P, S) 7 . construct CFG for 0* 1 (0 + 1)* L= { 1, 01,……..} P: S  A 1 B A  0A | ε B  0B | 1B | ε G = ( {S,A,B},{ ε ,0,1 }, P , S )

8. construct CFG for L = { a n b n | n ≥ 1 } L = { ab, aabb, aaabbb………………….} S  aSb S  ab To derive string aaabbb S  aSb ( S  aSb)  aaSbb ( S  aSb)  aaabbb

The generation of the Language ( set of strings ) using specific rules of the grammar is called a “Derivation”. In Derivation, we use production from start symbol . Derivation for any string is replacement of non-terminals one at a time by its appropriate definitions. There are two types of Derivations 1) Left Most Derivation (LMD) 2) Right Most Derivation (RMD) Derivation In context-free grammar , leftmost and rightmost derivation are  two methods for deriving a string by applying production to a variable in each step . Left Most Derivation :- it is a Derivation in which left most non-terminal is replaced first. Right Most Derivation :- it is Derivation in which Right most non-terminal is replaced first .

1) Derive the strings 1001 and 00101 from the following grammar using leftmost and rightmost derivations. S  A1B A  A | ε B  0 B | 1B | ε LMD = 1001 S  A1B (here A is replaced with ε )  1B  10B  100B  1001B  1001B ( B is replaced with ε )  1001 RMD = 1001 S  A1B  A10B  A100B  A1001B (here B is replaced with ε )  A1001 (here A is replaced with ε )  1001

LMD = 00101 S  A1B  0A1B  00A1B (here A is replaced with ε )  001B  0010B  00101B ( B is replaced with ε )  00101 RMD = 00101 S  A1B  A10B  A101B (here B is replaced with ε )  A101  0A101  00A101  00101

2) Derive the string 1000111 from the following grammar using leftmost and rightmost derivations. S  T00T T  0T | 1T | ε External Exam paper 2023-24

using leftmost and rightmost derivations.

4) Context free grammar given is S  AS| ε A  aa | ab | ba | bb Give leftmost and rightmost derivations for the following strings: aabbba baabab aaabbb

Left-most & Right-most Derivation Styles Derive the string a*(ab+10) from G: 22 E ==> E * E ==> F * E ==> a F * E ==> a * E ==> a * ( E ) ==> a * ( E + E) ==> a * ( F + E) ==> a * ( a F + E) ==> a * ( ab F + E) ==> a * (ab + E ) ==> a * (ab + F ) ==> a * (ab + 1 F ) ==> a * (ab + 10 F ) ==> a * (ab + 10) E = * => G a*(ab+10) Left-most derivation: E ==> E * E ==> E * ( E ) ==> E * (E + E ) ==> E * (E + F ) ==> E * (E + 1 F ) ==> E * (E + 10 F ) ==> E * ( E + 10) ==> E * ( F + 10) ==> E * ( a F + 10) ==> E * ( ab F + 0) ==> E * (ab + 10) ==> F * (ab + 10) ==> a F * (ab + 10) ==> a * (ab + 10) Right-most derivation: G: E => E+E | E*E | (E) | F F => aF | bF | 0F | 1F |  Always substitute leftmost variable Always substitute rightmost variable

Define sentential form. A sentential form is  any string derivable from the start symbol . Thus, in the derivation of a + a * a , E + T * F and E + F * a and F + a * a are all sentential forms. A sentence is a sentential form consisting only of terminals such as a + a * a. What is the right sentential form? A right-sentential form is  a sentential form that occurs in the rightmost derivation of some sentence

Parse Trees Parse trees are an alternative representation to derivations. Define Parse tree: - Parse tree is a graphical representation of Derivation for a given grammar . Constructing Parse Trees: Conditions of a Parse Tree are : 1) A root node is always a start symbol of the grammar. 2) The Derivation is read from left to right 3) The leaf nodes are always terminals interior nodes are always non-terminals . Grammar Parse Tree There can be several parse trees for the same string . • Ideally there should be only one parse tree for each string, i.e., The language is unambiguous.

ambiguous grammars Define ambiguous grammar .  If W is any string in the Language represented by the grammar if W ∈ L (G) has more than one parse tree or more than one left most derivation or more than one right most derivation then such grammar is called “ambiguous Grammar”. A grammar produces more than one parse tree for a sentence (string) is called as an ambiguous grammar. OR

Relation between inference, Derivation, and Parse Trees

Check the following grammar is ambiguous or not for the given string abab S  a | abSb | SAb A  bS | bAAb S  abSb  abab S  SAb  aAb  abSb  abab LMD -1 LMD -2 Solution:- string abab S a b b S S A b S a b S a We got more than one parse tree for left most derivation, So the grammar is ambiguous a

2) Check the following grammar is ambiguous or not for the given string id + id * id E  E + E E  E * E E  id E  E + E  id + E  id + E * E  id + id * E  id + id * id LMD -1 LMD -2 Solution:- string id + id * id E  E * E  E + E * E  id + E * E  id + id * E  id + id * id E E + E E E * id id id E E * E E id id E + id

Use of CNF

Normal Forms : Chomsky Normal Form (CNF) Grieback Normal Form ( GNF) Why normal forms? If all productions of the grammar could be expressed in the same form(s), then: It becomes easy to design algorithms that use the grammar It becomes easy to show proofs and properties

Normal forms for context- Free grammars Define Chomsky normal form ?  A context free grammar is in Chomsky normal form  when every rule is of the form A → BC and A → a , where a is a terminal, and A, B, and C are variables .   we have to perform three simplifications on CFG before conversion of CFG to Chomsky normal form : We must eliminate useless symbols those variables or terminals that do not appear in any derivation of a terminal string from the start symbol . We must eliminate ε productions , those of the form A  ε for some variable A. We must eliminate unit productions , those of the form A  B for variables A and B It is easier to work with a context free language if given a CFG is in normal form . All the grammar are not always optimized that means the grammar may consist of some extra symbols (non-terminal). Having extra symbols, unnecessarily increase the length of grammar. Simplification of grammar means reduction of grammar by removing useless symbols. Symbol means Non-terminals

1) Eliminating Useless Symbols A Symbol (variable) is said to be useful if it is satisfies the following properties 1) Generating property 2) Reachable Property Generating property :- We say the variable x is generating if it directly or indirectly generates a terminal string. S  AB A  a B  b S  AB A  aA | a B  bB S is generating both A and B. A and B directly generating terminal strings. A and B are generating. S is generating . Example: 1 Example: 2 A is generating B is non generating. so S is non generating .

Reachable Property Example: Reachable Symbols Finding reachable symbols  if A is reachable, and A  α , then all variables in α are reachable. S  a A  b Example: A is not reachable

Procedure to eliminate useless symbols:- A Symbol (variable) is said to be useless if 1) it cannot generate or derive a terminal symbol or ( Eliminate the non-generating symbols and all productions involving those symbols ). 2) it cannot Not reachable from starting symbol. NOTE:- you must eliminate non-generating symbols before non-reachable symbols. S  AB | CA A  a B  BC | AB C  aB | b Example:-1 A is generating B is Non generating So eliminate B S  CA A  a C  b Solution:- Eliminate nongenerating symbols and all productions involving such symbols Eliminate unreachable symbols

Useless symbols: example Given CFG: S → AB | a A → b B is not generating, so the grammar is S → a A → b Now A is not reachable, so the grammar is S → a

S  A11B | 11A | 11 A  0 B  BB Eliminate useless symbols for below CFG B is Non-generating Eliminate B. B is useless symbol S  11A | 11 A  0 S  AC | B A  a C  c | BC E  aA | e Example:- 1 Example:- 2 A is generating C is generating E is generating But B is nongenerating, so eliminate B E is Not reachable from starting symbol S  AC A  a C  c Solution:- Solution:-

2) Eliminate ε (epsilon ) productions Define Nullable Variable: a variable is said to be nullable if it either directly or indirectly derives epsilon. Our strategy is to begin by discovering which variables are “nullable”. If A is nullable A  ε , then whenever A appears in a production body, say B  CAD, We make two versions of the production one with A and without A in the body B  CAD | CD A (variable) is nullable if A  ε . In order to eliminate null product , every variable must be replaced with ε . Procedure to Eliminate ε Productions 1) Create two version of the production , one with the nullable variable and one without it . 2 ) Eliminate productions with ε bodies

Basis: If A   is a production in G, then A is nullable (note: A can still have other productions) Induction: If there is a production B  C 1 C 2 …C k , where every C i is nullable, then B is also nullable Algorithm to detect all nullable variables

Removing ɛ- productions: example S → AB A → aAA | ɛ B → bBB | ɛ S, A, and B are all nullable. New grammar: S → AB | A | B A → aAA | aA | a B → bBB | bB | b

S  aAb A  e | ε S indirectly derives epsilon S is nullable A is directly derives epsilon S and A are nullable S  aAb | ab A  e Solution:- Example:- 1 S  XY X  aX | ε Y  bY | ε Solution:- Example:- 2 X and Y directly derives epsilon S indirectly derives epsilon S, X, Y are nullable S  XY | Y | X X  aX | a Y  bY | b Example:- 4 Consider the grammar S  ABC A  BC | a B  bAC | ε C  cAB | ε and eliminate epsilon productions. S  ABC | AB | BC | AC | A | B | C. A  BC | B | C | a B  bAC | bA | bC | b C  cAB | cA | cB | c S and A indirectly derives epsilon S, A are nullable B and C directly derives epsilon B, C are nullable Example:- 3 S  XYX X  aX | ε Y  bY | ε X and Y are nullable S  XYX | XY | XX | YX | Y | X X  aX | a Y  bY | b Solution:-

eliminate unit productions Define unit Production. A production of the form X  Y where X and Y are non-terminals is said to be unit productions. S  XY | X X  aX | a Y  bbY | a | X Example:- 1 S  X and Y  X are the unit productions So eliminate them Replace with the productions in place of non-terminals S  XY | aX | a X  aX | a Y  bbY | a | aX | a To optimize the grammar, we need to eliminate the unit productions If X  Y is a unit production and Y  aX | bY then while removing X  Y production, we should add X  aX | bY Rule  Replace with the productions in place of non-terminals

Procedure to simplify or reduce the Grammar is by Step: 1 eliminate ε (epsilon) productions from the given grammar Step :2 eliminate unit productions from the grammar obtained in step 1 . Step: 3 eliminate the useless symbols from the given obtained in step 2 .

1) Simplify the below Grammar S  AACD A  aAb | ε C aC | a D  aDa | bDb | ε Solution:- Step 1 : Eliminate ε productions Nullable variables are A and D by directly. S indirectly generates epsilon, so nullable. S  AACD | ACD | AAC | CD | AC | C A  aAb | ab C  aC | a D  aDa | aa | bDb | bb Step 2 : Eliminate unit productions S  C is unit production S  AACD | ACD | AAC | CD | AC | aC | a A  aAb | ab C  aC | a D  aDa | aa | bDb | bb Step 3 : Eliminate useless symbols S is generating A, C, D are generating and all are reachable. No useless symbols. The reduced Grammar is S  AACD | ACD | AAC | CD | AC | a A  aAb | ab C  aC | a D  aDa | aa | bDb | bb After Elimination of Epsilon

2) Simplify the following grammar February,2023 S  aAa | bBb | BB A  C B  S | AC S  ε 7M Solution:- Step 1 : Eliminate ε productions S  aAa | bBb | bb | BB | B A  C B  S | AC Nullable variables are S by directly. B indirectly generates epsilon, so nullable. After Elimination of Epsilon Step 2 : Eliminate unit productions B  S & A  C is unit production S  aAa | bBb | bb | BB B  aAa | bBb | bb | BB | AC A  C ( no substitution for C) Step 3 : Eliminate useless symbols A & C is Non-generating so eliminate all A & C productions S  bBb | bb | BB B  bBb | bb | BB The reduced Grammar is

Normal forms A Grammar must be in some specified form or normalized form. we need to convert the grammar into normal form. Before going to convert the grammar into normal form, 1) First check whether the grammar is in reduced form or not. 2) If the Grammar is not in reduced form, first simplify the grammar and then convert it into normal forms. There are 2 types of normal forms. 1) Chomsky normal form (CNF) 2) Greiback Normal form (GNF). Define is Chomsky normal form ?  A context free grammar is in Chomsky normal form  when every rule is of the form A → BC and A → a, where a is a terminal, and A, B, and C are variables . Define Greiback normal form .  A context free grammar is said to be in GNF, if all of its productions are of the form A  a α where α is string of non-terminals and a is terminal.

Let G be a CFG for some L-{  } Definition: G is said to be in Chomsky Normal Form if all its productions are in one of the following two forms: A  BC where A,B,C are variables, or A  a where a is a terminal G has no useless symbols G has no unit productions G has no - productions Change non-CNF productions into CNF productions, i.e., A → BCD becomes A → BE, E → CD A → Fg becomes A → FG, G → g

Procedure To Convert CFG Into CNF step 1: check the given grammar is in simplified form, if not convert to simplified form. step 2: Convert the grammar into CNF we need to consider the bodies whose length is more than one, and for those we need to define a new non- terminals , which are not in our grammar and replace in the place of terminals. CNF : A → BC A → a where a is a terminal, and A, B, and C are variables For example: production S → aA can be decomposed as: S → RA   R → a    For Example: production rule X  XYZ can be decomposed as: X  PZ P  XY

1) Convert the CFG into CNF S  aSa | bSb | a | b | ε check the given grammar is in simplified form, if not convert to simplified form. Eliminate epsilon productions, Nullable variable { S } S  aSa | aa | bSb | bb | a | b There are no unit productions and useless symbols in the above grammar. Solution:- step 1: S  aSa | aa | bSb | bb | a | b step 2: A  a B  b New Productions for terminals S  ASA | AA | BSB | BB | a | b S  ASA S  BSB S  AX S  BY X  SA Y  SB S  AX | AA |BY | BB | a | b A  a B  b X  SA Y  SB New Productions for Non-terminals Finally CNF is New Productions for Non-terminals