Sudan University of Science and Technology College of Graduate Studies College of Computer Science and Information Technology
PhD Program in Computer Science Seminar Title: YACC (Yet Another Compiler Compiler ) Prepared By: Omer Salih Dawood Omer Email:[email protected] Supervised By: Dr. Mohamed Elhafiz Mustafa Musa 2
Agenda: Overview and History What Do Compilers Do? The Structure of a Compiler Review of Parser How Does YACC Work? References. 3
Overview and History The first real compiler FORTRAN compilers of the late 1950s 18 person-years to build Today, we can build a simple compiler in a few month. Crafting and efficient and reliable compiler is still challenging.
Another kind of language processor, called an interpreter , differs from a compiler in that it executes programs without explicitly performing a translation Advantages and Disadvantages of an interpreter What Do Compilers Do? Source Program Encoding Output Interpreter Data
What Do Compilers Do? (Cont ’ d.) Advantage Modification to program during execution Interactive debugging Not for every language, e.g., Basic, Pascal Dynamic-typed languages Variable types may change at run time, e.g., LISP. Difficult to compile Better diagnostics Source code is available. Machine independence However, the interpreter itself must be portable.
What Do Compilers Do? (Cont ’ d.) Disadvantage Slower execution due to repeated examination Substantial space overhead
The Structure of a Compiler Modern compilers are syntax-directed Compilation is driven the syntactic structure of programs; i.e., actions are associated with the structures. Any compiler must perform two major tasks Analysis of the source program Synthesis of a machine-language program
The Structure of a Compiler Scanner Parser Semantic Routines Code Generator Optimizer Source Program (Character Stream) Tokens Syntactic Structure Intermediate Representation Target Machine Code Symbol and Attribute Tables (Used by all Phases of The Compiler) The structure of a Syntax-Directed Compiler
The Structure of a Compiler (Cont ’ d.) Scanner The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols ( tokens ) The tokens are encoded and then are fetched to the parser for syntactic analysis For details. Scanner generators regular exp for tokens finite automata as programs lex or scangen
The Structure of a Compiler (Cont ’ d.) Parser Given a formal syntax specification (typically as a context-free grammar [CFG] ), the parse reads tokens and groups them into units as specified by the productions of the CFG being used. While parsing, the parser verifies correct syntax, and if a syntax error is found, it issues a suitable diagnostic . As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree . grammar yacc or llgen parser
The Structure of a Compiler (Cont ’ d.) Semantic Routines Perform two functions Check the static semantics of each construct Do the actual translation for generating IR The heart of a compiler Optimizer The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code. This phase can be very complex and slow.
The Structure of a Compiler (Cont ’ d.) One-pass compiler No optimization is required Retargetable compiler Many machine description files, e.g., gcc Compiler writing tools Compiler generators or compiler-compilers Lex and Yacc E.g., scanner and parser generators[1]
Review of Parser Yacc provides a general tool for imposing structure on the input to a computer program. The Yacc user prepares a specification of the input process; this includes rules describing the input structure, code to be invoked when these rules are recognized, and a low-level routine to do the basic input. Yacc then generates a function to control the input process[2]. Parser invokes scanner for tokens . Parser analyze the syntactic structure according to grammars . Finally , parser executes the semantic routines[3].
15 yacc Generate a new parser code from grammar YACC source (*.y) y.tab.h y.tab.c C compiler/linker Compile a new parser y.tab.c a.out a.out Parse source code Token stream Abstract Syntax Tree y.output How Does YACC Work?
16 YACC Format %{ C declarations %} yacc declarations %% Grammar rules %% Additional C code
17 YACC Example C declarations yacc declarations Grammar rules Additional C code
18 YACC Declaration Summary `%start' Specify the grammar's start symbol `%union' Declare the collection of data types that semantic values may have `%token' Declare a terminal symbol (token type name) with no precedence or associativity specified `%type' Declare the type of semantic values for a nonterminal symbol
19 YACC Declaration Summary `%right' Declare a terminal symbol (token type name) that is right-associative `%left' Declare a terminal symbol (token type name) that is left-associative `% nonassoc ' Declare a terminal symbol (token type name) that is nonassociative (using it in a way that would be associative is a syntax error)
20 YACC and Bison Command Yacc (AT&T) yacc –d xxx.y Bison (GNU) bison –d –y xxx.y Produce y.tab.c, the same as above yacc command
21 Work between LEX and YACC %{ #include < stdio.h > #include " y.tab.h " %} id [_a- zA -Z][_a-zA-Z0-9]* %% int { return INT; } char { return CHAR; } float { return FLOAT; } {id} { return ID;} %{ #include < stdio.h > #include < stdlib.h > %} %token CHAR, FLOAT, ID, INT %% yacc -d xxx.y Produced y.tab.h # define CHAR 258 # define FLOAT 259 # define ID 260 # define INT 261 parser.y scanner.l
References: 1- Book: Crafting a Compiler with C Author: Charles N. Fischer and Richard J. LeBlanc, Jr.The Benjamin/Cumming Publishing Company, Inc 2- dinosaur.compilertools.net/ yacc / 3- Ying-Hung Jiang “Introduction to Yacc “ 4- “YACC Parser Generator” ,www.csie.ntu.edu.tw/~b92006/YACC-present-2005.ppt 22