NATIONAL CAPODISTRIAN UNIVERSITY OF ATHENS F ACULTY OF SCIENCE DEPARTMENT OF INFORMATICS |
Development and Exploration Tool for Context-Free Grammars created with Java |
Greek Introduction - Ελληνική Εισαγωγή
Introductory Concepts
The grammars which are handled by this tool follow the BNF standarization. The only difference is that the symbolism used by the YACC meta-compiler has been preferred not only for reasons of simplicity but, also in order to keep the tool compatible with some future extensions which are currently under consideration. This decision is also due to the majot approval YACC has been enjoying from the Informatics community.
Grammar symbols are divided into terminals and nonterminals:
The productions of a grammar specify the way in which the terminals and nonterminals can be combined to form strings. Each production consists of a nonterminal, followed by a colon ":", followed by a string of nonterminals and terminals. Productions with the same lefthand part are grouped into a rule which has the common lefthand part and as right hand part the righthand parts of the productions divided by "|".
The following grammar that describes numerical expressions is an example:
E : E + T | T
In this grammar, the rules are the three lines of the grammar and the productions are the following:
E : E + T
In this example the set of the terminals is: {+, *, (, ), id}, while the nonterminals are : {E, T, F}. The start symbols is E (the left hand part of the first rule).
Although the dominative trend is that of writing the terminals of a language with small latin letters and the nonterminals with capitals, the only convention respected by this tool is that terminals start with a latin letter.
The order of the productions within a rule is insignificant. In the contrary, the order of the rules is important, not only in the process of the initial symbol recognition but also in order to recognize and eliminate left recursions, as we will see next.
There is a principle that symbols appearing at the left part of a rule are acceptable only if they have previously appeared in the right hand part of a preceding rule.
For practical reasons, the division of symbols with at least one space is imposed in order to enable the tool to distinguish the symbols of the grammar. On the other hand, productions of a grammar will be accepted for further exploration only if they are grouped in rules. The empty string is denoted with null.
If these principles are taken into account, the tool can be used to recognize the symbols of a grammar.
Grammar Symbols Recognition-Syntax Tree-Left Productions
Grammatical Properties and Transformations
A grammar is left recursive if it has a nonterminal A such that there is a derivation of the string A a through one (immediate left recursion) or multiple (non immediate left recursion) production of the grammar. If a is the empty string, then the recursion is a cycle. In case of a recursive grammar without cycles, the recursions can be eliminated.
Left factoring is a grammar transformation that is useful for producing a grammar suitable for predictive parsing.
The elimination of the recursions together with left factoring are needed in order to proceed to the next stages of syntax analysis.
Enhancements of recognition for such properties along with exploration of the necessary transformations are provided by the tool.
Grammar Properties & Transformations
Nonrecursive Predictive Parser Operation
Parsing Table Construction
Daveloping & Exploring Grammars
Vassilis Apostolopoulos - Thanos Varias - Nikolaos Theocharis
November 1998
The source of the above data is the book "Compilers: Principals Techniques and Tools"
by Aho, Sethi και Ullman, also known as "dragonbook".