
Antlr4: How's my grammar? (functions + expressions + scalars + arrays)

Status
Not open for further replies.
Code:
grammar Grammar;

import Operation;

firstrule: (assignment | function)*;      //nothing

assignment: (Identifier | array) '=' expression;                                //onEnter, if var == Identifier, set map, otherwise set double map?
                                                                                //visit expression with a visitor

expression: <assoc=right> expression op=EXP expression                           #OpExponent         //return exponent, visit expression
          | expression op=(MUL|DIV) expression                                  #OpMulDiv           //return mult or div, visit expression
          | expression '(' expression ')'                                       #OpExpressionMul1   //return mult, visit expression
          | function                                                            #LFunction          //visit function
          | expression Identifier                                               #OpExpressionMul2   //return mult, visit expression
          | expression op=(ADD|SUB) expression                                  #OpAddSub           //return
          | array                                                               #LArray             //visit array
          | Identifier                                                          #LIdentifier
          | Integer                                                             #LInteger
          | Float                                                               #LFloat
          | '(' expression ')'                                                  #OpExpression
          ;

array: Identifier '[' expression ']';                                           //read value given expression, validate that the expression is an integer

argumentList: expression (',' expression)*;

function: Identifier '(' argumentList ')';

COMMENT:                '//' ~[\r\n]* -> skip;
COMMENT_DELIMITED:      '/*' .*? '*/' -> skip;
Integer:                '0' | ([1-9][0-9]*);
Float:                  ('0' | ([1-9][0-9]*)) ('.' [0-9]*)?;
Identifier:             [a-zA-Z_][a-zA-Z0-9_]*;
WS:                     [ \t\r\n] -> skip;

Code:
lexer grammar Operation;

EXP: '**';
MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';
 
From a quick look it appears to be ambiguous, but I'm not sure how your parser generator handles the rules, so I'm guessing it's ambiguous from a theoretical standpoint.

For instance, according to the rules, we don't know whether 3 * 4 + 2 is built into:

  *
 / \
3   +
   / \
  4   2

or

    +
   / \
  *   2
 / \
3   4

This causes some shift-reduce problems when generating the parser. Fixing ambiguities by hand is a bit of a pain sometimes and can really make your grammar look ugly. Most parser libraries have ways to help resolve ambiguities by defining "precedence" or selection rules.
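To make the "precedence" idea concrete, here is a minimal sketch of precedence climbing in Python. It is not how ANTLR works internally (ANTLR 4 settles this in left-recursive rules by alternative order, with earlier alternatives binding tighter); the `PRECEDENCE` table and the `tokenize` and `parse` helpers are names assumed for illustration only. It only shows how a precedence table picks the second tree above.

```python
import re

# Binding strength per operator; a higher number binds tighter.
# These values are an assumption chosen to mirror usual arithmetic.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def tokenize(text):
    """Split the input into integer and operator tokens."""
    return re.findall(r"\d+|[-+*/]", text)


def parse(tokens, min_prec=1):
    """Precedence climbing: return a nested (op, left, right) tuple."""
    left = int(tokens.pop(0))
    while tokens and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # The right operand may only absorb operators that bind tighter,
        # which also makes every operator here left-associative.
        right = parse(tokens, PRECEDENCE[op] + 1)
        left = (op, left, right)
    return left


tree = parse(tokenize("3 * 4 + 2"))
print(tree)  # ('+', ('*', 3, 4), 2)
```

With the table above, the multiplication is grouped first and the addition ends up at the root, which is the second of the two trees.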
 
  • array: Identifier '[' expression ']'; should be array: '[' argumentList ']'; (what's the identifier even for in this context?)
  • It is ambiguous whether 0 is an integer or float.
  • [a-zA-Z0-9_] = [\w]
  • [ \t\r\n] = [\s]
  • I would imagine that putting all the constant expressions (function, Identifier, ...) before the ones that themselves begin in an expression would make more sense, and probably make it faster too, but that's just a guess.

By the way, why do you even want a distinction between integers and floats? A single Number class is nicer and more abstract.
 
It's Antlr, which resolves ambiguities through precedence. It also places all lexer rules above parser rules.

0 would be an Integer, since the Integer rule is defined before Float.
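A rough sketch of that tie-break, emulated with Python regexes rather than ANTLR itself: the lexer takes the longest match, and when two rules match the same length, the rule defined first wins. The two patterns are transcribed from the posted Integer and Float rules; `RULES` and `classify` are hypothetical names used only here.

```python
import re

# Token rules in declaration order, transcribed from the posted lexer rules.
RULES = [
    ("Integer", re.compile(r"0|[1-9][0-9]*")),
    ("Float", re.compile(r"(?:0|[1-9][0-9]*)(?:\.[0-9]*)?")),
]


def classify(text):
    """Return the name of the rule that wins for a whole-string match."""
    best_name, best_len = None, -1
    for name, pattern in RULES:
        match = pattern.match(text)
        # Strictly greater: on a tie the earlier rule keeps the win.
        if match and match.end() > best_len:
            best_name, best_len = name, match.end()
    return best_name if best_len == len(text) else None


print(classify("0"))    # Integer
print(classify("0.5"))  # Float
```

Under this emulation, Float only wins when a decimal point makes its match longer, so every plain whole number comes out as Integer.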

[a-zA-Z0-9_] = [\w]
[ \t\r\n] = [\s]

dunno if that'll work, it's not full regex, but cool ;)
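For what it's worth, the two equivalences can be checked against an ordinary regex engine. The sketch below uses Python's `re` module in ASCII mode, which is an assumption standing in for "full regex"; it says nothing about whether ANTLR's character sets accept these shorthands.

```python
import re
import string

# Sets written out explicitly in the grammar and in the suggestion.
explicit_word = set(string.ascii_letters + string.digits + "_")
explicit_space = set(" \t\r\n")

# Collect every ASCII character that each shorthand class accepts.
ascii_chars = [chr(code) for code in range(128)]
shorthand_word = {c for c in ascii_chars if re.fullmatch(r"\w", c, re.ASCII)}
shorthand_space = {c for c in ascii_chars if re.fullmatch(r"\s", c, re.ASCII)}

print(shorthand_word == explicit_word)             # True
print(sorted(shorthand_space - explicit_space))    # ['\x0b', '\x0c']
```

So in this engine `\w` is the same set as `[a-zA-Z0-9_]`, while `\s` is slightly wider than `[ \t\r\n]`: it also accepts vertical tab and form feed.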

I would imagine that putting all the constant expressions (function, Identifier, ...) before the ones that themselves begin in an expression would make more sense, and probably make it faster too, but that's just a guess.

it won't ;p

By the way, why do you even want a distinction between integers and floats? a single Number class is nicer and more abstracted.

typechecking
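As an illustration of what typechecking could buy here, the comment on the array rule asks to validate that the index expression is an integer. A minimal sketch of such a check follows; the tuple-shaped AST nodes, the `type_of` helper, and the float element type are all assumptions made for the example, not part of the posted grammar.

```python
# Hypothetical AST node shapes, assumed for this sketch only:
#   ("int", 3)  ("float", 2.5)  ("add", left, right)  ("index", name, expr)


def type_of(node):
    """Return "int" or "float" for a node, raising TypeError on misuse."""
    kind = node[0]
    if kind in ("int", "float"):
        return kind
    if kind == "add":
        left, right = type_of(node[1]), type_of(node[2])
        # Mixed arithmetic widens to float; int + int stays int.
        return "int" if left == right == "int" else "float"
    if kind == "index":
        if type_of(node[2]) != "int":
            raise TypeError(f"index into {node[1]} must be an integer")
        # Element type is assumed to be float purely for illustration.
        return "float"
    raise ValueError(f"unknown node kind: {kind}")


print(type_of(("index", "a", ("add", ("int", 1), ("int", 2)))))  # float
```

With a single Number type, an index such as `a[1.5]` could only be rejected at run time; keeping Integer and Float apart lets a check like this reject it before evaluation.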
 