
Project Description

Hime Parser Generator generates lexers and parsers in C# 2.0 for given grammars. Hime supports a wide range of parsing methods: LR(0), LR(1), LALR(1), RNGLR(1), RNGLALR(1) and LR(*).
The tool itself is developed in C#. Lexers and parsers can be generated either by the command line tool, or programmatically by using the library.

Features

  • Grammar inheritance (grammars can inherit rules, symbols, etc. from other grammars).
  • Can generate documentation for the grammars (HTML pages for the grammar and the automata, visuals for the automata).
  • Lexer: Enhanced regular expression language
    • always generates a minimal DFA for maximum speed
    • use the usual operators: ?, *, +, |, as well as {n} for an exact number of matches and {m, n} for a range of match counts
    • easily exclude expressions with the language difference operator: exp1 - exp2 matches only the input that exp1 matches and exp2 does not
    • easily specify non-printable characters using Unicode values (0x1234) or Unicode ranges: 0x1234 .. 0x5678
    • use Unicode character blocks: \ub{IsGreek}
    • use Unicode character classes: \uc{Lu}
  • Parser: Enhanced context-free grammar language
    • use regexp-like operators: *, +, ?, |
    • write template rules: head<param> -> TERMINAL param variable<param> ;
      • you may specify multiple parameters
      • parameters can be used directly in the rule definition or as values for the instantiation of other templates
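
The language difference operator listed under the lexer features can be read as set difference on the languages of two automata. The Python sketch below illustrates that semantics with a product construction over two complete DFAs. It is not Hime's implementation, and the `DFA` class, the `difference` function and the two example automata are hypothetical names introduced only for this illustration.

```python
from itertools import product


class DFA:
    """A complete DFA: every state has a transition for every symbol."""

    def __init__(self, alphabet, transitions, start, accepting):
        self.alphabet = alphabet
        self.transitions = transitions  # {(state, symbol): next_state}
        self.start = start
        self.accepting = accepting

    def states(self):
        # The DFA is complete, so every state appears as a transition source.
        return {state for (state, _) in self.transitions}

    def accepts(self, text):
        state = self.start
        for ch in text:
            state = self.transitions[(state, ch)]
        return state in self.accepting


def difference(a, b):
    """Build a DFA accepting the words accepted by `a` but not by `b`."""
    assert a.alphabet == b.alphabet
    pairs = list(product(a.states(), b.states()))
    transitions = {
        ((sa, sb), ch): (a.transitions[(sa, ch)], b.transitions[(sb, ch)])
        for (sa, sb) in pairs
        for ch in a.alphabet
    }
    # A pair state accepts when `a` accepts and `b` does not.
    accepting = {
        (sa, sb) for (sa, sb) in pairs
        if sa in a.accepting and sb not in b.accepting
    }
    return DFA(a.alphabet, transitions, (a.start, b.start), accepting)


ALPHABET = {"i", "f"}

# exp1: one or more letters from the alphabet.
letters = DFA(
    ALPHABET,
    {(0, "i"): 1, (0, "f"): 1, (1, "i"): 1, (1, "f"): 1},
    start=0,
    accepting={1},
)

# exp2: exactly the word "if" (state 3 is a dead state).
keyword_if = DFA(
    ALPHABET,
    {
        (0, "i"): 1, (0, "f"): 3,
        (1, "i"): 3, (1, "f"): 2,
        (2, "i"): 3, (2, "f"): 3,
        (3, "i"): 3, (3, "f"): 3,
    },
    start=0,
    accepting={2},
)

# exp1 - exp2: any word of letters except the keyword "if".
identifier = difference(letters, keyword_if)
```

With these automata, `identifier.accepts("iff")` is true while `identifier.accepts("if")` is false, which is the typical use of a difference operator: identifiers minus reserved words.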

Grammar Debugging

The Hime Parser Generator outputs a log as an MHTML file for the grammar compilation process (-l option in the command line). This log will report grammar conflicts that are still unresolved. The log gives useful information regarding the conflicts, helping you solve them.

log_conflict_small.png

As shown in the picture above, for each conflict the log will report:
  • The state where the conflict occurs, so that you can look it up in the generated grammar documentation (-d option in the command line)
  • The conflicting lookahead
  • The list of conflicting items
  • Examples of inputs that are ambiguous because of this conflict
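
To see why a conflict goes together with ambiguous inputs, consider the classic ambiguous grammar E -> E + E | id. This grammar is a textbook example and is not taken from Hime's documentation. After reading "id + id" with "+" as the lookahead, an LR parser can either reduce or shift, and each choice leads to a different parse tree. The Python sketch below counts the distinct parse trees for such inputs; `count_parse_trees` is a hypothetical helper written for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_parse_trees(operand_count):
    """Count parse trees of `id + id + ... + id` under E -> E + E | id.

    `operand_count` is the number of `id` tokens in the input.
    """
    if operand_count == 1:
        return 1  # E -> id is the only derivation
    # Pick which "+" is the root: `left` operands go to the left subtree,
    # the remaining operands go to the right subtree.
    return sum(
        count_parse_trees(left) * count_parse_trees(operand_count - left)
        for left in range(1, operand_count)
    )


trees_for_two = count_parse_trees(2)    # id + id
trees_for_three = count_parse_trees(3)  # id + id + id
```

An input with two operands has a single parse tree, while "id + id + id" has two, one for each way of resolving the shift/reduce choice. That input is the kind of ambiguous example a conflict report can point to.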

Grammar Documentation

The Hime Parser Generator outputs documentation as an MHTML file for the grammar being compiled (use the -d option in the command line). This documentation contains:
  • The complete set of "flat" grammar rules with syntax highlighting.
  • The generated LR graph, as a network of hyperlinked web pages that give a detailed description of each state's items.

The picture below shows an example of the output for the grammar of the ANSI C language:

doc_grammar.png




The picture below shows an example of a generated web page for a given state in the LR graph. Each item of the state is detailed with:
  • An icon indicating whether this item is part of a conflict.
  • The action (SHIFT or REDUCE) for the item. A hyperlink to the next state is given for SHIFT actions.
  • The item's grammar rule with the dot position.
  • The item's lookaheads.

doc_state.png
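
The information listed for each item maps naturally onto a small data structure. The Python sketch below models an LR item with its rule, dot position and lookaheads, and derives the action from the dot position. The `LRItem` class and its field names are assumptions made for this illustration; they do not reflect Hime's internal types or the exact layout of the generated pages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LRItem:
    head: str              # left-hand side of the grammar rule
    body: tuple            # right-hand side symbols, in order
    dot: int               # number of body symbols already recognized
    lookaheads: frozenset  # terminals allowed after the rule

    @property
    def action(self):
        # A dot at the end of the body means the whole rule has been read.
        return "REDUCE" if self.dot == len(self.body) else "SHIFT"

    def __str__(self):
        symbols = list(self.body)
        symbols.insert(self.dot, ".")
        lookaheads = ", ".join(sorted(self.lookaheads))
        return f"{self.head} -> {' '.join(symbols)}  [{lookaheads}]"


shift_item = LRItem("E", ("E", "+", "E"), 1, frozenset({"+", "$"}))
reduce_item = LRItem("E", ("E", "+", "E"), 3, frozenset({"+", "$"}))
```

Here `shift_item` has the dot before "+", so its action is SHIFT, while `reduce_item` has the dot at the end of the rule, so its action is REDUCE. Two such items in the same state with the shared lookahead "+" are what a shift/reduce conflict looks like.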

Last edited Jul 14 2011 at 8:27 AM by lwouters, version 11