Grammar fails to create generator

Oct 25, 2011 at 12:09 AM

Greetings,

I'm relatively new to creating grammars and parsers, though I've worked with existing ones before.

I have the following grammar:

public cf text grammar Recipe
{
  options {
    Axiom = "recipe";
    Separator = "SEPARATOR";
  }
  terminals {
    INTEGER    -> [1-9] [0-9]* | '0' ;
    REAL    -> INTEGER? '.' INTEGER  (('e' | 'E') ('+' | '-')? INTEGER)?
            |  INTEGER ('e' | 'E') ('+' | '-')? INTEGER ;
    FRACTION  -> INTEGER ('/' | 0x2044) INTEGER | 0x00BC | 0x00BD | 0x00BE | 0x2153 | 0x2154 | 0x215B | 0x215C | 0x215D | 0x215E;
    NUMBER    -> INTEGER | REAL | FRACTION;
    WHITESPACE  -> 0x0020 | 0x0009 | 0x000B | 0x000C;
    SEPARATOR  -> WHITESPACE+ | ',';
    MEASURE    -> 'cup' | 'cups' | 'teaspoon';
    PREP    -> 'sliced' | 'cubed' | 'drained';
    NOTE    -> '(' [a-zA-Z]+ ')';
    INGREDIENT  -> ([a-zA-Z]+);    
    CR      -> 0x000D;
    LF      -> 0x000A;
  }
  rules {
    fooditem  -> NUMBER SEPARATOR MEASURE? SEPARATOR INGREDIENT SEPARATOR PREP? SEPARATOR NOTE?;
    foodlist  -> (fooditem CR? LF?)+;
    recipe     -> foodlist;
  }
} 

The log displays the following errors:

ERROR: Grammar: In state A expected terminal MEASURE cannot be produced by the lexer. Check the regular expressions.
ERROR: Grammar: In state 13 expected terminal PREP cannot be produced by the lexer. Check the regular expressions.
ERROR: Grammar: In state 17 expected terminal PREP cannot be produced by the lexer. Check the regular expressions.

Why is this? Is it due to the ambiguity introduced by the INGREDIENT terminal? I'm trying to generate a parser that handles ambiguities like this. How can I do this without the lexer becoming unable to produce these terminals?

Thanks for any insights into this...

Coordinator
Oct 25, 2011 at 7:28 AM

Hello,

Thank you for your input.

The problem you are experiencing comes from the fact that when writing the lexical rules (regular expressions), multiple rules may match the same input. In your example the definition of INGREDIENT is a superset of MEASURE and PREP. This means that any input matched by MEASURE or PREP can also be matched by INGREDIENT.

To resolve these ambiguities, the tool uses a prioritization system. The later a lexical rule is defined, the higher its priority. So in your example, the INGREDIENT rule has higher priority than MEASURE and PREP. Consequently, MEASURE and PREP can never be produced because INGREDIENT will always override them.

To fix your issue, you have to move the MEASURE and PREP rules after the INGREDIENT rule.
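
For illustration, your `terminals` section could be reordered roughly as follows. This is only a sketch: every definition is copied unchanged from your grammar, and the only difference is that MEASURE and PREP now come after INGREDIENT.

```
  terminals {
    INTEGER    -> [1-9] [0-9]* | '0' ;
    REAL       -> INTEGER? '.' INTEGER  (('e' | 'E') ('+' | '-')? INTEGER)?
               |  INTEGER ('e' | 'E') ('+' | '-')? INTEGER ;
    FRACTION   -> INTEGER ('/' | 0x2044) INTEGER | 0x00BC | 0x00BD | 0x00BE | 0x2153 | 0x2154 | 0x215B | 0x215C | 0x215D | 0x215E;
    NUMBER     -> INTEGER | REAL | FRACTION;
    WHITESPACE -> 0x0020 | 0x0009 | 0x000B | 0x000C;
    SEPARATOR  -> WHITESPACE+ | ',';
    NOTE       -> '(' [a-zA-Z]+ ')';
    INGREDIENT -> ([a-zA-Z]+);
    // Defined after INGREDIENT, so these win when both rules match the same input
    MEASURE    -> 'cup' | 'cups' | 'teaspoon';
    PREP       -> 'sliced' | 'cubed' | 'drained';
    CR         -> 0x000D;
    LF         -> 0x000A;
  }
```

With this order, a word such as `cups` should be produced as MEASURE, while words that match neither MEASURE nor PREP are still produced as INGREDIENT.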

I hope this helps you.

Kindly,

Laurent

Oct 25, 2011 at 11:29 AM

Thanks for the insight, Laurent.

At first I thought you were telling me that I had to rewrite the `fooditem` rule, but then I realized that you are talking about the order in which the terminals appear in the `terminals` section. I did, in fact, reorder the terminal rules and now it works. So, thanks!