    We consider one approach to using C++ to write a compiler in combination with the lexical analysis tool Lex and the parser generator tool YACC. This approach uses C++ classes and constructors to build a syntax tree during the parse. Synthesized and inherited attributes are easy to declare for individual terminals and nonterminals, and synthesized attributes can be computed during the parse as long as no inherited attributes are involved in the computation. We also consider the use of virtual functions for tree traversal. A complete example YACC specification is included which demonstrates these techniques. Results show that C++ offers several advantages over C for compiler design with YACC and Lex.
    Parse tree
    Lexical analysis
    Citations (0)
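    As a rough illustration of the approach described in the abstract above, the following C++ sketch builds a small syntax tree through constructors and traverses it with virtual functions. The class names (`Node`, `Num`, `BinOp`) and the helpers `num` and `bin` are hypothetical and not taken from the paper; the helpers stand in for what YACC semantic actions would do on each reduction.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node hierarchy; names are illustrative, not from the paper.
struct Node {
    virtual ~Node() = default;
    virtual int eval() const = 0;          // traversal computing a synthesized value
    virtual std::string show() const = 0;  // traversal printing the tree in prefix form
};

struct Num : Node {
    int value;
    explicit Num(int v) : value(v) {}
    int eval() const override { return value; }
    std::string show() const override { return std::to_string(value); }
};

struct BinOp : Node {
    char op;
    std::unique_ptr<Node> lhs, rhs;
    BinOp(char o, std::unique_ptr<Node> l, std::unique_ptr<Node> r)
        : op(o), lhs(std::move(l)), rhs(std::move(r)) {
        assert(op == '+' || op == '*');  // this sketch supports only two operators
    }
    int eval() const override {
        return op == '+' ? lhs->eval() + rhs->eval()
                         : lhs->eval() * rhs->eval();
    }
    std::string show() const override {
        return std::string("(") + op + " " + lhs->show() + " " + rhs->show() + ")";
    }
};

// Stand-ins for semantic actions. In a YACC grammar the equivalent step
// would sit in an action such as:  expr : expr '+' expr { $$ = ...; }
// (assumed shape; the paper's actual specification is not shown here).
std::unique_ptr<Node> num(int v) { return std::make_unique<Num>(v); }

std::unique_ptr<Node> bin(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    return std::make_unique<BinOp>(op, std::move(l), std::move(r));
}
```

    Calling `bin('+', num(1), bin('*', num(2), num(3)))` builds the tree for `1 + 2 * 3` bottom-up, the same order in which an LR parser would reduce it. Adding another traversal means adding one more virtual function rather than a switch over node tags, which is one of the conveniences over plain C that the abstract alludes to.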
    Since BNF (context-free) grammars have been found so useful in describing the syntax of programming languages, a large number of parsing algorithms for context-free grammars have been developed for use in compilers and compiler-writing systems. A significant class of these algorithms has the following properties.
    Context-free language
    Dynamic compilation
    Compiler construction
    Citations (1)
    The availability of large, syntactically bracketed corpora such as the Penn Tree Bank affords us the opportunity to automatically build or train broad-coverage grammars, and in particular to train probabilistic grammars. A number of recent parsing experiments have also indicated that grammars whose production probabilities depend on the context can be more effective than context-free grammars in selecting a correct parse. To make maximal use of context, we have automatically constructed, from the Penn Tree Bank version 2, a grammar in which the symbols S and NP are the only real nonterminals, and the other nonterminals or grammatical nodes are in effect embedded into the right-hand sides of the S and NP rules. For example, one of the rules extracted from the tree bank would be S -> NP VBX JJ CC VBX NP [1] (where NP is a nonterminal and the other symbols are terminals: part-of-speech tags of the Tree Bank). The most common structure in the Tree Bank associated with this expansion is (S NP (VP (VP VBX (ADJ JJ) CC (VP VBX NP)))) [2]. So if our parser uses rule [1] in parsing a sentence, it will generate structure [2] for the corresponding part of the sentence. Using 94% of the Penn Tree Bank for training, we extracted 32,296 distinct rules (23,386 for S and 8,910 for NP). We also built a smaller version of the grammar, based on higher-frequency patterns, for use as a back-up when the larger grammar is unable to produce a parse due to memory limitations. We applied this parser to 1,989 Wall Street Journal sentences (separate from the training set and with no limit on sentence length). Of the parsed sentences (1,899), the percentage of no-crossing sentences is 33.9%, and Parseval recall and precision are 73.43% and 72.61%.
    Tree (set theory)
    Parse tree
    Parsing expression grammar
    Citations (116)
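    The pairing of a flat rule with its most frequent bracketed structure, as described in the abstract above, can be pictured as a simple lookup table. The sketch below is only an assumption about how such a table might be represented; the names `RuleTable`, `example_table`, `structure_for`, and `balanced` are hypothetical, and the table holds just the single rule/structure pair quoted in the abstract.

```cpp
#include <cassert>
#include <map>
#include <string>

// Maps a flat rule (as text) to the bracketed structure it should expand to.
using RuleTable = std::map<std::string, std::string>;

// Returns true if every '(' in s is closed by a matching ')'.
bool balanced(const std::string& s) {
    int depth = 0;
    for (char c : s) {
        if (c == '(') {
            ++depth;
        } else if (c == ')') {
            if (--depth < 0) return false;  // a close with no matching open
        }
    }
    return depth == 0;
}

// Table containing only the example pair [1] -> [2] from the abstract.
RuleTable example_table() {
    RuleTable t;
    t["S -> NP VBX JJ CC VBX NP"] =
        "(S NP (VP (VP VBX (ADJ JJ) CC (VP VBX NP))))";
    for (const auto& entry : t) {
        assert(balanced(entry.second));  // sanity check on stored structures
    }
    return t;
}

// Looks up the structure for a rule; returns nullptr if the rule is unknown.
const std::string* structure_for(const RuleTable& t, const std::string& rule) {
    auto it = t.find(rule);
    return it == t.end() ? nullptr : &it->second;
}
```

    When the parser applies rule [1], a lookup of this kind would yield structure [2] for that span of the sentence. A real system would also need to substitute the parsed NP subtrees and the actual words into the template, which this sketch leaves out.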
    We describe a program for the display and exploration of complex, domain-specific information: ytracc, an interactive grammar debugging tool for compiler writers. The ytracc system gives the designer of a yacc grammar a way to trace a parser as it uses the grammar; ytracc captures the states of the parse as it is carried out. The captured parse can then be replayed forwards or backwards, step-by-step or subtree-by-subtree, as defined by the non-terminals of the grammar. The tool has been used successfully by students as an assistant in an advanced undergraduate compiler construction class, and we use the tool in our everyday work.
    Compiler construction
    Parsing expression grammar
    Tracing
    Citations (3)
    Parsing expression grammar
    S-attributed grammar
    LR parser
    Indexed grammar
    We use evolutionary algorithms to speed up a rather complex process: parsing with tree adjoining grammars. The improvement is due to a linear matching function that compares the fitness of different individuals. Internally, derived trees are processed as tree-to-string representations. We also present some practical results and a post-run analysis that may encourage the use of evolutionary techniques in, for example, the parsing of mildly context-sensitive languages.
    Tree (set theory)
    Parsing expression grammar
    S-attributed grammar
    Citations (2)