How to use?

Feb 22, 2014 at 9:13 AM
I tried to write a simple application for 0.6.1 using the code in the tutorials. I can't seem to get it to work. As soon as it gets to parser.Parse(), I get an error "Object reference not set to an instance of an object.: A first chance exception of type 'System.NullReferenceException' occurred in Hime.Redist.dll". Is something missing from the zip file?

Perhaps a more straightforward demo could be included in the source code.

I have a much more complex application working that was for 0.5.0, so I guess I'll stick with that for now.
Coordinator
Feb 22, 2014 at 10:04 AM
Hi drewkeller,

Thanks for your input. Some breaking changes were introduced in version 0.6.1 in order to improve overall performance. Generated parsers are now split into 4 files (2 .cs and 2 .bin files). The .bin files contain the binary representation of the automata and need to be added as embedded resources. This may be one of the issues you encountered. I could be more specific if you provide the stack trace of the exception.
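One quick way to check whether missing .bin resources are behind the NullReferenceException is to list what the assembly actually embeds. The sketch below relies on the standard .NET call Assembly.GetManifestResourceNames(). The helper ResourceCheck is not part of Hime, and the file names MyGrammarLexer.bin and MyGrammarParser.bin are hypothetical placeholders for whatever the generator produced for your grammar.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class ResourceCheck
{
    // Returns the expected file names that no embedded resource name ends with.
    // Embedded resource names are usually prefixed with the project's default
    // namespace (for example "MyApp.MyGrammarLexer.bin"), hence the suffix test.
    public static List<string> FindMissing(IEnumerable<string> embeddedNames, IEnumerable<string> expectedFiles)
    {
        var names = embeddedNames.ToList();
        return expectedFiles
            .Where(file => !names.Any(n => n.EndsWith(file, StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }

    // Convenience overload that inspects a real assembly.
    public static List<string> FindMissing(Assembly assembly, params string[] expectedFiles)
    {
        return FindMissing(assembly.GetManifestResourceNames(), expectedFiles);
    }
}
```

Calling ResourceCheck.FindMissing(someType.Assembly, "MyGrammarLexer.bin", "MyGrammarParser.bin") from the project that contains the generated parser should return an empty list when both files are embedded; any name it returns still has its Build Action set to something other than Embedded Resource.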

Anyway, if you are happy with the version you use (0.5.0), you can safely keep it that way; you won't gain much on the features' front by updating.
FYI, we are working on the future 1.0 version, which I hope to release in April. This will also break APIs with the current version, so I would recommend waiting for the update. The API will be stable from this version on and provide major performance improvements.

Laurent
Feb 23, 2014 at 7:27 AM
Cool, thanks for the update!

I had actually made the various code changes already, so it wasn't too difficult to shoehorn the new version back in. I added the .bin files as embedded resources and it works now.

The only issue I'm having with the new version is determining how many arguments to pop off the stack when parsing something like "Sum(1,2,3,4)", where the number of arguments is arbitrary. In the old version, I could calculate it from SubRoot.Children.Count. How would I determine that now?
   // Parse something like Sum(1,2,3,4), where the number of arguments is arbitrary.
   function -> op_conditional
      | VARIABLE '('! op_conditional (',' op_conditional)* ')'! { OnFunction } ;
Coordinator
Feb 23, 2014 at 10:54 AM
Hi, the mechanism for the semantic actions is slightly different. You can find the relevant tutorial in the documentation. Essentially, you will have to implement a method with the signature
void OnFunction(Variable head, Symbol[] body, int length);
The first parameter will be the 'function' variable in your case, the 'body' parameter will contain all the children and quite transparently, the number of children is given in the last parameter :)
Once this is done, you will have to create an instance of the Actions class (nested class of the generated parser class), set the appropriate member and pass this instance to the parser (constructor).
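The three steps just described (create the Actions instance, set the member, pass it to the constructor) might look roughly like the sketch below. Everything in it is a stand-in written for illustration: Symbol, Variable, SemanticAction, DemoParser, SimulateReduce and Calculator are hypothetical names that only mimic the shapes mentioned in this thread, not the real Hime.Redist or generated classes, so check the tutorial for the actual member names.

```csharp
using System;

// Stand-in types: these only mimic the shapes described in this thread and
// are NOT the real Hime.Redist classes.
public class Symbol { public string Name; public Symbol(string name) { Name = name; } }
public class Variable : Symbol { public Variable(string name) : base(name) { } }

public delegate void SemanticAction(Variable head, Symbol[] body, int length);

// Hypothetical generated parser with a nested Actions class.
public class DemoParser
{
    public class Actions
    {
        public SemanticAction OnFunction; // one member per action named in the grammar
    }

    private readonly Actions actions;

    public DemoParser(Actions actions) { this.actions = actions; }

    // Simulates the parser reducing the 'function' rule and firing its action.
    public void SimulateReduce(Variable head, Symbol[] body, int length)
    {
        if (actions.OnFunction != null) actions.OnFunction(head, body, length);
    }
}

public class Calculator
{
    public int LastLength = -1;

    public void OnFunction(Variable head, Symbol[] body, int length)
    {
        LastLength = length; // real processing would go here
    }

    public DemoParser BuildParser()
    {
        var actions = new DemoParser.Actions();   // 1. create the Actions instance
        actions.OnFunction = OnFunction;          // 2. set the appropriate member
        return new DemoParser(actions);           // 3. pass it to the parser constructor
    }
}
```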
The tutorial covers all these steps. However, should you have any problem, feel free to report.
Also, I recognise that the process is quite heavy; so should you have any proposal as to how you would prefer to set up semantic actions, don't hesitate.

Laurent
Feb 23, 2014 at 2:06 PM
I read through the tutorials. Maybe I missed something. I have a few tests set up with different expressions. Below is what I get. The parameters supplied to OnFunction always contain the same thing, but the stack changes. The stack might contain other stuff from previous operations, so I can't simply use the length of the stack.

I tried putting the semantic trigger '^' after the last ')' in my grammar, but it didn't seem to change anything.
public void OnFunction(Variable head, Symbol[] body, int length)
{
  string func = (body[0] as TextToken).Value.ToLower();
  int argCount = // ????
  ...(processing)...
}
Test 1
Expression: Sum(1,2)
Stack: 2,1
body: VARIABLE, '(', op_conditional, _v1E, ')', null, null, null.... (array contains 20 elements, the remaining ones are null)
length: 5

Test 2
Expression: Sum(1,2,3)
Stack: 3,2,1
body: VARIABLE, '(', op_conditional, _v1E, ')', null, null, null.... (array contains 20 elements, the remaining ones are null)
length: 5

Test 3
Expression: Sum(1,2,3,4)
Stack: 4,3,2,1
body: VARIABLE, '(', op_conditional, _v1E, ')', null, null, null.... (array contains 20 elements, the remaining ones are null)
length: 5
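
The dumps above suggest that body is a buffer larger than the rule body itself, so only its first length entries are meaningful. Below is a minimal sketch of reading it defensively. Symbol here is a stand-in type and BodyReader a hypothetical helper, neither taken from Hime.Redist.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the library's symbol type; not the real Hime.Redist class.
public class Symbol
{
    public string Name;
    public Symbol(string name) { Name = name; }
}

public static class BodyReader
{
    // Collects the names of the symbols that belong to the rule body,
    // ignoring the unused (null) tail of the buffer.
    public static List<string> Names(Symbol[] body, int length)
    {
        var names = new List<string>();
        for (int i = 0; i < length && i < body.Length; i++)
        {
            if (body[i] != null) names.Add(body[i].Name);
        }
        return names;
    }
}
```

With the 0.6.1 behaviour shown in the three tests, this yields the same five names every time, which is why the argument count cannot be recovered from body alone.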
Coordinator
Feb 24, 2014 at 6:45 AM
There is indeed something wrong with the behavior of the semantic actions. What is happening is that the semantic actions are called before some of the AST manipulations have taken place. In particular, the strange '_v1E' you see is an internal grammar variable that is generated for the implementation of the * and + operators in your grammar. You should not have to see them. I also reckon that because the semantic actions only pass the symbols in the body parameter and not the AST nodes, you will not be able to work around the issue.

Because the AST build process is different in the upcoming version, the issue is already fixed there. I'll see what I can do for the current version. A bug has been opened: Issue 497.

Once again, I would encourage you to stay with the 0.5.0 version if it works for you for the time being and wait for the upcoming 1.0.0 version in a couple of months. That version aims at stability and long-term support (LTS).

Laurent
Feb 24, 2014 at 6:10 PM
Thanks for looking into it.

In the old version, I drill down into the variable tree items to determine how many arguments were passed. I think the items added from the optional regex grouping are added as children of the preceding child item instead of to the parent item (which would make them siblings). I don't know whether that would be considered a bug or not, but I guess it doesn't really matter.
        public void OnFunction(SyntaxTreeNode SubRoot)
        {
            var node = (SyntaxTreeNode)SubRoot.Children[0];
            string func = ((SymbolTokenText)node.Symbol).ValueText.ToLower();
            int argCount = (SubRoot.Children.Count - 3);
            SubRoot = SubRoot.Children[3];
            bool found = true;
            // drill down the tree, looking for symbols that contain variables
            while (SubRoot.Children.Count > 2 && found)
            {
                found = false;
                foreach (SyntaxTreeNode sub in SubRoot.Children)
                {
                    var symbol = sub.Symbol as SymbolVariable;
                    if (symbol == null) continue;
                    if (symbol.Name.StartsWith("_v"))
                    {
                        argCount++;
                        SubRoot = sub;
                        found = true;
                    }
                }
            }
            ...processing....
        }
I like how the newer version is cleaner within the actions. However, I like being able to simply inherit/implement the actions class in the older version instead of having to add all of them, which is kind of a maintenance pain whenever I add a new action.

0.5
        public void OnVariable(SyntaxTreeNode SubRoot)
        {
            var node = (SyntaxTreeNode)SubRoot.Children[0];
            string name = ((SymbolTokenText)node.Symbol).ValueText;
            ...
        }
0.6
        public void OnVariable(Variable head, Symbol[] body, int length)
        {
            string name = (body[0] as TextToken).Value;
            ...
        }
Coordinator
Feb 25, 2014 at 6:37 AM
Thanks for the input. Then I suppose a good middle ground regarding the actions would be to have an Actions class nested in the Parser class:
class Actions
{
    public virtual void OnVariable(Variable head, Symbol[] body, int length) { }
    public virtual void OnX(Variable head, Symbol[] body, int length) { }
    public virtual void OnY(Variable head, Symbol[] body, int length) { }
    // etc.
}
This would keep the advantage of the current signature for the actions and still remove some of the visible clutter with the delegates.
You can simply extend the Actions class and override the methods that are relevant.
This has the added bonus that in the case of regenerating the parser with new actions this won't break existing code.

What do you think of this solution?
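
A minimal sketch of how user code might look under this proposal. Symbol and Variable are stand-in types rather than the real Hime.Redist classes, the Actions class is written at top level here for brevity instead of nested in a generated parser, and MyActions is a hypothetical user class.

```csharp
using System;
using System.Collections.Generic;

// Stand-in types; not the real Hime.Redist classes.
public class Symbol { public string Name; public Symbol(string name) { Name = name; } }
public class Variable : Symbol { public Variable(string name) : base(name) { } }

// Shape of the proposed base class: every action defaults to a no-op.
public class Actions
{
    public virtual void OnVariable(Variable head, Symbol[] body, int length) { }
    public virtual void OnFunction(Variable head, Symbol[] body, int length) { }
}

// User code overrides only what it needs; actions added later by a
// regenerated parser stay no-ops and do not break compilation.
public class MyActions : Actions
{
    public readonly List<string> Seen = new List<string>();

    public override void OnVariable(Variable head, Symbol[] body, int length)
    {
        if (length > 0 && body[0] != null) Seen.Add(body[0].Name);
    }
}
```

Because OnFunction is not overridden in MyActions, calling it is harmless, which is the property that keeps existing code compiling when new actions appear in the grammar.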

Also, it is definitely a bug that you see the auto-generated variables (prefixed with _v). The tree should be flattened before the action is called so that the content of the "body" parameter reflects the grammar rule that has been written. The point of the action method signature in version 0.6.1 is to pass only the symbols on the AST and not the nodes themselves, because it is unsafe to manipulate the nodes before the AST is completely built.
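
To make "flattened" concrete, here is a small sketch of the idea on a stand-in tree. Node and Flattener are hypothetical and unrelated to the library's internals, and the nested layout of the _v nodes is an assumption based on the tree shape described earlier in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in tree node; not a Hime.Redist type.
public class Node
{
    public string Name;
    public List<Node> Children = new List<Node>();
    public Node(string name, params Node[] children) { Name = name; Children.AddRange(children); }
}

public static class Flattener
{
    // Returns the children of 'node' with every generated "_v..." node
    // replaced, recursively, by its own children.
    public static List<Node> FlatChildren(Node node)
    {
        var result = new List<Node>();
        foreach (var child in node.Children)
        {
            if (child.Name.StartsWith("_v", StringComparison.Ordinal))
                result.AddRange(FlatChildren(child));
            else
                result.Add(child);
        }
        return result;
    }

    // Counts the arguments of a 'function' node after flattening.
    public static int ArgCount(Node function)
    {
        return FlatChildren(function).Count(c => c.Name == "op_conditional");
    }
}
```

Once the generated nodes are spliced out, the number of op_conditional entries is the argument count, with no need to drill through the tree by hand.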

On the same subject, the same signature is used for the actions in the upcoming version 1.0.0, although there the AST is essentially de facto immutable, so it would be safe to pass the nodes to the actions.
Feb 25, 2014 at 12:44 PM
> Thanks for the input. Then I suppose a good middle ground regarding the actions would be to have an Actions class nested in the Parser class: {...}
>
> This would keep the advantage of the current signature for the actions and still remove some of the visible clutter with the delegates.
> You can simply extend the Actions class and override the methods that are relevant.
> This has the added bonus that in the case of regenerating the parser with new actions this won't break existing code.
>
> What do you think of this solution?

Sounds good.