Performance considerations

Version 0.6.0 of the Hime Parser Generator introduced performance improvements in the lexing and parsing algorithms. The table below summarizes the measured performance of these algorithms.

                  Lexer only   LR(k)     RNGLR on unambiguous inputs   RNGLR on highly ambiguous inputs
tokens / s        2,400,000    250,000   160,000
MB of text / s    23           2.4       1.5
ms / MB of text   42           400       640

The benchmark used to obtain these results is included in the source code. It was run on a single core at 2.4 GHz. The lexer and parser used in the benchmark are generated from the grammar for the input language of the parser generator (except for the last column). The input is a single file containing the definition of the C# grammar repeated 600 times, for a total size of 16,677,600 bytes and 1,741,201 tokens. Each result is the mean of 20 runs. The lexer, parser and input were chosen to be representative of typical languages: the input mixes very short tokens (one character) with long ones (comments).
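The three rows of the table are different views of the same measurement, so they can be cross-checked against the input size and token count given above. The following is a minimal sketch of that arithmetic, not part of the Hime benchmark itself. The function name `derived_metrics` is hypothetical, and the code assumes that the table's throughput unit means megabytes of 10^6 bytes, which the text does not state.

```python
# Figures quoted in the benchmark description.
INPUT_BYTES = 16_677_600
INPUT_TOKENS = 1_741_201


def derived_metrics(mb_per_s, bytes_per_mb=1_000_000):
    """Derive run time, token rate and ms/MB from a throughput in MB/s.

    bytes_per_mb is an assumption: the source does not say whether a
    megabyte is 10**6 or 2**20 bytes.
    """
    seconds = INPUT_BYTES / (mb_per_s * bytes_per_mb)
    return {
        "seconds": seconds,                      # time for one pass over the input
        "tokens_per_s": INPUT_TOKENS / seconds,  # token throughput
        "ms_per_mb": 1000.0 / mb_per_s,          # reciprocal of the throughput
    }


# The three throughput values listed in the table, in order.
for mb_per_s in (23, 2.4, 1.5):
    m = derived_metrics(mb_per_s)
    print(f"{mb_per_s:>4} MB/s: {m['seconds']:.2f} s per run, "
          f"{m['tokens_per_s']:,.0f} tokens/s, {m['ms_per_mb']:.0f} ms/MB")
```

Under these assumptions the derived token rates come out close to the tabulated 2,400,000, 250,000 and 160,000 tokens per second. The reciprocals (roughly 43, 417 and 667 ms per megabyte) differ slightly from the tabulated 42, 400 and 640, which suggests the published figures were rounded independently from the raw timings.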

Last edited May 14, 2013 at 2:20 PM by lwouters, version 1

