Compiler Design Project

861 views 13 slides Oct 19, 2020

About This Presentation

Project based on the Theory of Computation (TOC)


Slide Content

Compiler Design Project
Topic: To convert BNF rules into YACC form and write code to generate an abstract syntax tree.
Group 19: Dushyant Sharma, Rohit Kumar, Yogesh Sharma, Raghulal Prajapati, Rajesh Kumar

BNF Notation in Compiler Design
BNF stands for Backus-Naur Form. It is a formal method for describing the syntax of a programming language, introduced by John Backus and Peter Naur around 1960. BNF and CFG (context-free grammar) are essentially equivalent notations. BNF is a meta-language (a language used to describe another language) for programming languages. For human consumption, grammars are written in this notation, called Backus-Naur Form (BNF). Different languages have different descriptions and rules, but the general structure of a BNF rule is given below:

<name> ::= expansion

The symbol ::= means "may expand into" or "may be replaced with." In some texts, a name is also called a non-terminal symbol. Every name in Backus-Naur form is surrounded by angle brackets, < >, whether it appears on the left- or right-hand side of a rule. An expansion is an expression containing terminal symbols and non-terminal symbols, joined together by sequencing and choice. A terminal symbol is either a literal (like "+" or "function") or a category of literals (like integer). Simply juxtaposing expressions indicates sequencing; a vertical bar | indicates choice.
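As a concrete illustration of these conventions, here is a small BNF fragment for arithmetic expressions (the non-terminal names are illustrative, not taken from the slides):

```bnf
<expr>   ::= <expr> "+" <term> | <term>
<term>   ::= <term> "*" <factor> | <factor>
<factor> ::= "(" <expr> ")" | <integer>
```

Note how the vertical bar gives each non-terminal several alternative expansions, while juxtaposition (e.g. <expr> "+" <term>) expresses sequencing.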

Introduction to YACC
A parser generator is a program that takes as input a specification of a syntax and produces as output a procedure for recognizing that language. Historically, parser generators are also called compiler-compilers. YACC (Yet Another Compiler-Compiler) is an LALR(1) (Look-Ahead LR: left-to-right scan, rightmost derivation in reverse, with 1 lookahead token) parser generator. YACC was originally designed to be complemented by the lexical analyzer generator Lex.

Input File: The YACC input file is divided into three parts.

Definition Part: The definition part includes information about the tokens used in the syntax definition:

%token NUMBER
%token ID

Yacc automatically assigns numbers to tokens, but this can be overridden:

%token NUMBER 621

Yacc also recognizes single characters as tokens, so explicitly assigned token numbers should not overlap the ASCII codes. The definition part can also include C code external to the definition of the parser, and variable declarations, between %{ and %} in the first column. It can also include the specification of the starting symbol of the grammar.
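Putting those elements together, a definition part might look like the sketch below (the token names and start symbol expr are illustrative, not from the slides):

```yacc
%{
/* C code between %{ and %} is copied verbatim into the generated parser */
#include <stdio.h>
int yylex(void);
void yyerror(const char *s);
%}

%token NUMBER        /* token number assigned automatically by Yacc */
%token ID 621        /* explicit token number, kept above the ASCII range */
%start expr          /* optional: names the starting symbol of the grammar */
```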

Input File: Rules Part: The rules part contains the grammar definition in a modified BNF form. Actions are C code in { } and can be embedded inside the rules (translation schemes).
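A sketch of a rules part for a small expression grammar, with actions embedded in braces (this assumes a NUMBER token from the definition part and the default integer semantic-value type):

```yacc
%%
expr : expr '+' term   { $$ = $1 + $3; }   /* action: C code in braces */
     | term            { $$ = $1; }
     ;
term : term '*' NUMBER { $$ = $1 * $3; }
     | NUMBER          { $$ = $1; }
     ;
%%
```

Note the modified BNF form: no angle brackets, a colon instead of ::=, and a semicolon terminating each rule.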

Input File: Auxiliary Routines Part: The auxiliary routines part is plain C code. It includes function definitions for every function needed in the rules part. It can also contain the main() function definition if the parser is going to be run as a program; the main() function must call the function yyparse(). If yylex() is not defined in the auxiliary routines section, it should be included with: #include "lex.yy.c". The YACC input file name conventionally ends with the extension .y.
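A minimal sketch of an auxiliary routines part, following the second %% of the input file (it assumes Lex has already produced lex.yy.c in the same directory):

```yacc
/* Auxiliary routines: plain C code after the second %% */
#include "lex.yy.c"   /* supplies yylex() when it is not defined here */

void yyerror(const char *s) { fprintf(stderr, "error: %s\n", s); }

int main(void) { return yyparse(); }  /* main() must call yyparse() */
```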

Output Files: The output of YACC is a file named y.tab.c. If it contains the main() definition, it can be compiled into an executable; otherwise, the code serves as an external definition of the function int yyparse(). If called with the -d option on the command line, Yacc also produces a header file y.tab.h with all its specific definitions (particularly important are the token definitions, to be included, for example, in a Lex input file). If called with the -v option, Yacc produces a file y.output containing a textual description of the LALR(1) parsing table used by the parser; this is useful for tracking down how the parser resolves conflicts.

Syntax Directed Translation in Compiler Design
Background: The parser uses a CFG (context-free grammar) to validate the input string and produce output for the next phase of the compiler. The output can be either a parse tree or an abstract syntax tree. To interleave semantic analysis with the syntax analysis phase of the compiler, we use Syntax-Directed Translation.
Definition: Syntax-Directed Translation (SDT) consists of rules augmenting the grammar that facilitate semantic analysis. SDT involves passing information bottom-up and/or top-down through the parse tree in the form of attributes attached to the nodes. Syntax-directed translation rules use 1) lexical values of nodes, 2) constants, and 3) attributes associated with the non-terminals in their definitions.

The general approach to Syntax-Directed Translation is to construct a parse tree or syntax tree and compute the values of attributes at the nodes of the tree by visiting them in some order. In many cases, translation can be done during parsing without building an explicit tree.

Example:
E -> E + T | T
T -> T * F | F
F -> INTLIT

This grammar syntactically validates an expression containing additions and multiplications. To carry out semantic analysis, we augment this grammar with SDT rules in order to pass information up the parse tree and check for semantic errors, if any. In this example we focus on evaluating the given expression, since there are no semantic assertions to check in this very basic case.

E -> E + T  { E.val = E.val + T.val }
E -> T      { E.val = T.val }
T -> T * F  { T.val = T.val * F.val }
T -> F      { T.val = F.val }
F -> INTLIT { F.val = INTLIT.lexval }

Let's take a string to see how semantic analysis happens: S = 2 + 3 * 4 (the corresponding parse tree appears on the slide). To evaluate the translation rules, we can use a single depth-first traversal of the parse tree. This works only because the SDT rules impose no particular evaluation order beyond requiring that children's attributes be computed before their parents', a requirement a depth-first traversal satisfies for a grammar whose attributes are all synthesized. Otherwise, we would have to work out a suitable plan for traversing the parse tree and evaluating all the attributes in one or more passes.

The diagram on the slide shows how this semantic analysis proceeds. Information flows bottom-up, and all children's attributes are computed before their parents', as discussed above. Right-hand-side occurrences of a non-terminal are sometimes annotated with a subscript 1 (e.g., E -> E1 + T) to distinguish children from parents.