CSE-3201-Lecture-04-Lexical Analysis-Regular Expression.pptx

mahammedcse 1 views 16 slides Oct 11, 2025

Slide 1 of 16

About This Presentation

Topic:
Lexical Analysis-Regular Expression.

Size: 236.84 KB

Language: en

Added: Oct 11, 2025

Slides: 16 pages

Slide Content

Lecture-04 Lexical Analysis ( Regular Expressions) Mostafiz Ahammed Lecturer Department of Computer Science and Engineering Notre Dame University Bangladesh

Language and Regular Expressions A Regular Expression is a Set of Rules / Techniques for Constructing Sequences of Symbols (Strings) From an Alphabet. Let  Be an A l ph a bet, r a Regular Expre s s i on Th e n L(r) is the Language that is Characterized by the Rules of r.

Rules for specifying Regular Expressions Regular expressions over alphabet   is a regular expression that denotes {  }. If a is a symbol (i.e., if a   ), then a is a regular expression that denotes {a}. Suppose r and s are regular expressions denoting the languages L(r) and L(s). Then (r) | (s) is a regular expression denoting L(r) U L(s). (r)(s) is a regular expression denoting L(r)L(s). (r) * is a regular expression denoting (L(r)) * . (r) is a regular expression denoting L(r).

How to “Parse” Regular Expressions Precedence: * has highest precedence. Concatenation as middle precedence. | has lowest precedence. Use parentheses to override these rules. Examples: – a b* = a (b*) If you want (a b)* you must use parentheses. a | b c = a | (b c) If you want (a | b) c you must use parentheses. Concatenation and | are associative. – (a b) c = a (b c) = a b c – (a | b) | c = a | (b | c) = a | b | c Example: – b d | e f * | g a = (b d) | (e (f *)) | (g a)

Example Let  = {a, b} The regular expression a | b denotes the set {a, b} The regular expression (a|b)(a|b) denotes {aa, ab, ba, bb} The regular expression a * denotes the set of all strings of zero or more a’s. i.e., {  , a, aa, aaa, ….. } The regular expression (a|b) * denotes the set containing zero or more instances of an a or b. The regular expression a|a*b denotes the set containing the string a and all strings consisting of zero or more a’s followed by one b.

Equality vs Equivalence Are these regular expressions equal? R = a a* (b | c) S = a* a (c | b) ... No! Yet, they describe the same language. L(R) = L(S) “Equivalence” of regular expressions If L(R) = L(S) then we say R  S “R is equivalent to S” From now on, we’ll just say R = S to mean R  S

Algebraic law of regular expressions

Regular Definition If Σ is an alphabet of basic symbols then a regular definition is a sequence of the following form: d 1  r 1 d 2  r 2 …….. d n  r n where Each d i is a new symbol such that d i  Σ and d i  d j where j < I Each r i is a regular expression over Σ  {d 1 ,d 2 ,…,d i-1 )

Regular Definition

Addition Notation / Shorthand

Unsigned Number 1240, 39.45, 6.33E15, or 1.578E-41 digit  0 | 1 | 2 | … | 9 digits  digit digit* optional_fraction  . digits |  optional_exponent  ( E ( + | -|  ) digits ) |  num  digits optional_fraction optional_exponent digit  0 | 1 | 2 | … | 9 digits  digit + optional_fraction  (. digits ) ? optional_exponent  ( E ( + | -) ? digits ) ? num  digits optional_fraction optional_exponent Shorthand

Nonregular sets

What Else Does Lexical Analyzer Do? Scan a w ay b l a n k s , new li n es, t a bs Can we Define Tokens For These? bl a nk tab  blank  tab newline  newline delim  blank | tab | newline ws  delim + Ans: No token is returned to parser

Overall Regular Token Attribute-Value Expression ws - - if if - then then - else else - id id pointer to table entry num num Exact value < relop LT <= relop LE = relop EQ < > relop NE > relop GT >= relop GE Note: Each token has a unique token identifier to define category of lexemes

Example of Regular Expression Regular Expression for numbers digit  0|1|…|9 digits  digit digit* optional_fraction  .digits|  optional_exponent  ( E (+|-|  ) digits ) |  num  digits optional_fraction optional_exponent Using shorthands: digit  0|1|…|9 digits  digit + optional_fraction  (.digits)? optional_exponent  ( E (+|-|  ) digits ) ? num  digits optional_fraction optional_exponent ? means “zero or one instance of” r? is a shorthand of r | 

CSE-3201-Lecture-04-Lexical Analysis-Regular Expression.pptx

About This Presentation

Slide Content

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

CSE-3201-Lecture-04-Lexical Analysis-Regular Expression.pptx

About This Presentation

Slide Content

Slide 1

Slide 2

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Slide 16

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

8-top-ai-courses-for-customer-support-representatives-in-2025.pptx

7-essential-ai-courses-for-call-center-supervisors-in-2025.pptx

25-essential-ai-courses-for-user-support-specialists-in-2025.pptx

8-essential-ai-courses-for-insurance-customer-service-representatives-in-2025.pptx

Know for Certain

PPT OPD LES 3ertt4t4tqqqe23e3e3rq2qq232.pptx