Null Reference Exception Thrown when Trying to Create Parser Object

May 30, 2012 at 4:01 PM

As the title says, when I try to create the parser object, I get a null reference exception.

The code used is this:

Interpreter inter = new Interpreter();

ReligionLexer lexer = new ReligionLexer( "graphical_culture = muslimgfx" );
ReligionParser parser = new ReligionParser( lexer, inter );

SyntaxTreeNode root = parser.Analyse();

And the grammar file is this:

public cf text grammar Religion
{
    options    {
        Axiom = "idOption";
        Separator = "SEPARATOR";
    }
    terminals {
        INT -> [0-9]*;
        FLOAT -> INT? '.' INT;
        NUMBER -> INT|FLOAT;
        BOOL -> 'yes'|'no';
       
        ASSIGN -> '=';
        OPEN -> '{';
        CLOSE -> '}';
       
        LETTER -> [a-z A-Z 0x00C0 .. 0x01AF];
        ID -> (LETTER|'_') (LETTER|[0-9]|'_');
       
        COMMENT -> '#' . ('\r'|'\n');
       
        WHITE_SPACE -> ' ' | '\t' | '\r' | '\n';
        SEPARATOR -> WHITE_SPACE+;
    }
    rules{
        idOption -> ID ASSIGN ID {doIdOption};
    }   
}

Coordinator
May 30, 2012 at 5:03 PM

Hi,

Thank you for your input. I have indeed been able to replicate your issue. You stumbled upon a known bug that has been resolved in the code base, although no release includes the fix yet. To avoid the problem, you may either:

-          Download the sources from CodePlex and recompile the solution. You can then replace the Hime.Redist.dll, Hime.CentralDogma.dll and himecc.exe files in the distribution you downloaded with the versions you just compiled. Note that the grammar syntax changed slightly in the latest version: you will need to replace ‘public cf text grammar’ with ‘cf grammar’.

-          Or choose another parsing method. I understand that you ran himecc without specifying a parsing method. By default the tool uses the RNGLALR1 method, which triggers the bug. To avoid this, please use the following command line:

himecc Religion.gram -m LALR1

The LALR(1) parsing method is less powerful than RNGLR. However, if your grammar is simple enough, it will suffice. If you absolutely need the RNGLR parsing method because your grammar contains conflicts you cannot resolve, you will need to fall back on the first solution. Using the command line above, I was able to avoid your issue.

In addition to addressing your problem, I have made some corrections to the grammar you provided:

public cf text grammar Religion
{
    options    {
        Axiom = "idOption";
        Separator = "SEPARATOR";
    }
    terminals {
        INT -> [0-9]*;
        FLOAT -> INT? '.' INT;
        NUMBER -> INT|FLOAT;
        BOOL -> 'yes'|'no';
      
        ASSIGN -> '=';
        OPEN -> '{';
        CLOSE -> '}';
      
        // Character spans in character classes (a-z) must not be separated by spaces; otherwise the space character is included in the character class.
        // Unicode character spans cannot be used in the character class syntax.
        // Your intended expression should be written as follows:
        LETTER -> [a-zA-Z] | 0x00C0 .. 0x01AF ;
        // Added the * so that IDs can be one or more characters long;
        // without the *, all IDs would be exactly 2 characters long
        ID -> (LETTER|'_') (LETTER|[0-9]|'_')* ;
      
        // Fixed your comment definition with known best practice :)
        NEW_LINE -> 0x000D /* CR */
                        |  0x000A /* LF */
                        |  0x000D 0x000A /* CR LF */
                        |  0x2028 /* LS */
                        |  0x2029 /* PS */ ;
        // This regular expression means a comment begins with # and ends at the line break; everything in between is part of the comment,
        // but the content of the comment itself cannot contain a line break.
        // Here we use the - operator to express the exclusion.
        COMMENT -> '#' (.* - (.* NEW_LINE .*)) NEW_LINE ;
       
       
        WHITE_SPACE -> ' ' | '\t' | '\r' | '\n';
        SEPARATOR -> WHITE_SPACE+;
    }
    rules{
        idOption -> ID ASSIGN ID {doIdOption};
    }  
}
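
For readers more used to .NET regular expressions, the effect of the LETTER, ID and COMMENT corrections can be approximated as below. This is only a rough sketch. It assumes that System.Text.RegularExpressions patterns are a fair stand-in for the grammar's terminal syntax, which the lexer generated by himecc does not actually use. The class and field names (TerminalSketch, IdTwoChars, Id, LetterWithSpaces, Comment) are invented for the illustration and are not part of Hime.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical illustration: .NET regex analogues of three terminals.
// Not Hime code; the generated lexer works differently.
public static class TerminalSketch
{
    // LETTER -> [a-zA-Z] | 0x00C0 .. 0x01AF
    private const string Letter = @"[a-zA-Z\u00C0-\u01AF]";

    // ID without the trailing *: exactly two characters.
    public static readonly Regex IdTwoChars =
        new Regex(@"\A(" + Letter + "|_)(" + Letter + @"|[0-9]|_)\z");

    // ID with the trailing *: one character or more.
    public static readonly Regex Id =
        new Regex(@"\A(" + Letter + "|_)(" + Letter + @"|[0-9]|_)*\z");

    // A class written with spaces between spans also accepts ' '.
    public static readonly Regex LetterWithSpaces =
        new Regex(@"\A[a-z A-Z]\z");

    // COMMENT: '#', then characters that are not line breaks,
    // then exactly one NEW_LINE (CR LF, CR, LF, LS or PS).
    public static readonly Regex Comment =
        new Regex(@"\A#[^\r\n\u2028\u2029]*(\r\n|\r|\n|\u2028|\u2029)\z");

    public static void Main()
    {
        // The two identifiers from the original question.
        Console.WriteLine(IdTwoChars.IsMatch("muslimgfx"));   // False
        Console.WriteLine(Id.IsMatch("muslimgfx"));           // True
        Console.WriteLine(Id.IsMatch("graphical_culture"));   // True

        // The stray spaces let a blank through as a "letter".
        Console.WriteLine(LetterWithSpaces.IsMatch(" "));     // True

        // A comment stops at the first line break.
        Console.WriteLine(Comment.IsMatch("# note\r\n"));     // True
        Console.WriteLine(Comment.IsMatch("# a\nb\n"));       // False
    }
}
```

The anchors \A and \z are used instead of ^ and $ so that each pattern has to match the whole input, which is closer to how a single token is recognised.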

Hope this helps.

Best,

Laurent

May 30, 2012 at 6:02 PM

Thank you, that's sorted the problem out. And thanks for fixing the grammar errors too.