Recognizing multi-line comment blocks

Oct 14, 2014 at 11:14 AM
Hi, everybody!
How should I write a rule to recognize multi-line comments? A comment begins with (* and ends with *).
I tried the following grammar:
terminals 
    {
        COMMENT      -> '(*' .* '*)' ;
    }
This works fine when the source text has only one comment block. But when the source text has several comment blocks, the recognized terminal contains everything from the first matching '(*' to the last matching '*)'.
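The same effect can be reproduced outside Hime. The following is only a rough analogue in Python's `re` module, not Hime itself: the pattern mirrors the rule above, and the `re.DOTALL` flag and the sample `source` string are assumptions made for illustration.

```python
import re

# Analogue of COMMENT -> '(*' .* '*)' ; DOTALL lets '.' also match newlines.
GREEDY_COMMENT = re.compile(r"\(\*.*\*\)", re.DOTALL)

# Hypothetical input with two separate comment blocks.
source = "(* first *) x := 1; (* second *)"

match = GREEDY_COMMENT.search(source)
# The greedy '.*' runs to the last '*)', so both comments and the
# code between them end up in a single match.
print(match.group(0))
```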
Coordinator
Oct 14, 2014 at 11:52 AM
Hello,

To do this, you can use the language difference operator, written '-'. For your case:
COMMENT -> '(*' (.* - (.* '*/' .*)) '*)' ;
This rule will match any string that begins with (*, ends with *), and has a string of zero or more characters in between, except that the inner string cannot contain the *) sequence.
In practice, this will solve your issue because it is the formal definition of the informal rule 'the comment ends at the first *) occurrence'.
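As a rough illustration of that definition outside Hime, here is a minimal Python sketch. Python's `re` has no language difference operator, so a negative lookahead stands in for it; the names `COMMENT`, `is_comment`, and the sample `source` are hypothetical, and `re.DOTALL` is an assumption so that comments may span lines.

```python
import re

# '(*', then any characters as long as '*)' does not start at the
# current position, then the closing '*)'. The lookahead plays the
# role of the difference operator here.
COMMENT = re.compile(r"\(\*(?:(?!\*\)).)*\*\)", re.DOTALL)

def is_comment(text: str) -> bool:
    """Literal reading of the rule: '(*' + inner + '*)', inner has no '*)'."""
    if len(text) < 4 or not text.startswith("(*") or not text.endswith("*)"):
        return False
    return "*)" not in text[2:-2]

# Hypothetical input: two comment blocks, the second spanning two lines.
source = "(* first *) x := 1; (* second\nline *)"

comments = COMMENT.findall(source)
print(comments)  # ['(* first *)', '(* second\nline *)']
```

Each block is now matched separately, and `is_comment` accepts each of them while rejecting the whole `source` string, which is the behaviour the difference-based rule describes.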

As an additional note, this project has been moved to bitbucket at: https://bitbucket.org/laurentw/hime

I hope this helps!

Laurent
Oct 14, 2014 at 6:27 PM
Thanks, Laurent! It's perfect, but it has a small mistake. The right notation is:
COMMENT -> '(*' (.* - (.* '*)' .*)) '*)' ;
Also, I found a similar solution in your extras\Grammars\
Thanks