So, I decided to touch up my old compiler again and work on the preprocessor I'm using to build it, since Antlr doesn't let you import and stuff when you use modes. It's missing some other things too, like #if.
If anyone ever wants to start working with Antlr4 and build their own compilers in the future, I really recommend you use this stuff : P. It'll make your life way easier.
I have the .jar if anyone wants me to attach it.
Also, I figure I'll put up my little test grammar thing. This one's a bit better than the default Antlr4 test harness. Running this thing without any arguments will tell you how to use it.
This is an example. Note that -lexer Antlr expects a file called AntlrLexer.
grammar/input.g4 -channel Channel -lexer Antlr -package compile.antlr -tokens
edit
Ah, forgot this tiny snippet.
Example of use using test harness
Output with tokens (I still have a few debug messages in there)
Anyways, I defined the syntax so that it can be used with most languages, since there's no telling which target language a given Antlr grammar will be used with.
You can import files (treat them like macros I guess)
#<import("filename" args...)>
You can read arguments and defined variables
#<$var>
I do want to do string interpolation, but it would really wreck things across languages. If you couldn't use a $ in your strings all of a sudden or something, haha, I dunno what people would think. I guess I can always just require everyone to escape it. It'll only be for the grammar files anyways. String interpolation is not yet implemented.
"$hello"
You can run an expression through the interpreter, which allows you to declare and assign variables, among other things. Evaluate is not yet implemented, but it is trivial to add.
#<evaluate expression>
The interpreter supports the following operations and allows strings, doubles, integers, booleans, and variables. The operations are valid on all types. Some of the operations on some of the types are pretty funky.
+ - / * %
! < <= > >= == != && ||
=
( )
All expressions run through the interpreter.
You can do if-statements. Everything between an if and its matching end is only emitted when the expression is true.
#<if (expression)>
#<end>
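To make the gating concrete, here's a rough sketch of how an if-block can switch emission on and off with a stack of booleans, similar in spirit to the enabled stack in the preprocessor code. The names (ConditionalEmitSketch, beginIf, end, emit) are invented for this example, and having a nested block inherit a disabled parent is an assumption of the sketch, not something the preprocessor is confirmed to do.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-alone model of #<if (expression)> ... #<end> gating.
final class ConditionalEmitSketch {
    private final Deque<Boolean> enabled = new ArrayDeque<>();
    private final StringBuilder output = new StringBuilder();

    // Emission is on when no if-block is open or the innermost one is on.
    boolean isEnabled() {
        return enabled.isEmpty() || enabled.peek();
    }

    // Assumption: a block nested inside a disabled block stays disabled.
    void beginIf(boolean condition) {
        enabled.push(isEnabled() && condition);
    }

    // Corresponds to #<end>.
    void end() {
        enabled.pop();
    }

    // Text is only kept while emission is enabled.
    void emit(String text) {
        if (isEnabled()) {
            output.append(text);
        }
    }

    String result() {
        return output.toString();
    }

    public static void main(String[] args) {
        ConditionalEmitSketch s = new ConditionalEmitSketch();
        s.emit("a");
        s.beginIf(false);
        s.emit("b");
        s.end();
        s.emit("c");
        System.out.println(s.result());
    }
}
```

The stack is what makes nesting work: each end just restores whatever state was active before the matching if.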
You can define arguments for a file. When you import the file, you can pass values to these arguments.
#<arguments>
#<end>
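As a rough picture of the argument passing, going by the comments in the preprocessor code: import pushes the values it was given, and the arguments block pops one value per declared name, in order. The sketch below models that with a queue. ArgumentBindingSketch and bind are made-up names, and dropping names that get no value is based on define ignoring null values.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of binding import arguments to declared names.
final class ArgumentBindingSketch {
    static Map<String, String> bind(List<String> declaredNames, List<String> passedValues) {
        Queue<String> pending = new ArrayDeque<>(passedValues);
        Map<String, String> symbols = new HashMap<>();
        for (String name : declaredNames) {
            String value = pending.poll(); // null once the values run out
            if (value != null) {
                symbols.put(name, value);
            }
        }
        return symbols; // values that were never claimed are simply discarded
    }

    public static void main(String[] args) {
        System.out.println(bind(Arrays.asList("a", "b"), Arrays.asList("1", "2", "3")));
    }
}
```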
You can define a package. It's essentially the same thing as an external file, just inside of your current file. Packages are faster to use. They are essentially macros. Packages are not yet implemented.
Code:
#<package name>
<arguments>
<end>
#<end>
And then using a package
#<implement packageName(args...)>
No need to do functions and what not, that's what Java is for : p. Loops may be useful. I guess packages and files would be preprocessor functions.
I guess I'll go over some of the strange operations for strings now.
String subtraction removes every occurrence of the right string from the left string.
"helloelel" - "el" = "hlo"
String multiplication either does the cross of two strings as two sets or repeats a string a number of times, depending on whether the right side is a string or a number.
"hello" * "rawr" = "hrawrerawrlrawrlrawrorawr"
"hello" * 4 = "hellohellohellohello"
String division gets the counted intersection of the two strings, so each character in the second string can only be matched once. In the example below, there is only one h in the second string. This is legit set division.
"hhello" / "meh" = "he"
The especially strange one is string modulo. The best way I could define this was to find the disjoint set (counted, like in division) between the two strings, i.e. whatever is left on both sides after matching characters cancel each other out.
"hello" % "h46" = "ello46"
When a number is on the left side and a string is on the right, it will do its very best to treat that string as a number.
5 + "5" = 10
But if the string can't be parsed as a number, like in the following, it'll resort to string concatenation.
5 + "f" = "5f"
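The coercion rule can be sketched on its own like this. MixedAddSketch is a made-up name and returning Object is just a shortcut for the example; the real interpreter wraps results in its Value class. The check for a "." to decide between integer and double mirrors what add does.

```java
// Hypothetical helper for number + string.
final class MixedAddSketch {
    // Returns an Integer, a Double, or a String depending on what right holds.
    static Object add(int left, String right) {
        try {
            if (right.contains(".")) {
                return left + Double.parseDouble(right);
            }
            return left + Integer.parseInt(right);
        } catch (NumberFormatException e) {
            return left + right; // not a number, fall back to concatenation
        }
    }

    public static void main(String[] args) {
        System.out.println(add(5, "5")); // prints 10
        System.out.println(add(5, "f")); // prints 5f
    }
}
```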
Only decimal is supported for the base at the moment, but I may add binary, hexadecimal, and octal later on.
The interpreter has both a lexer and a parser. The preprocessor only has a lexer. The preprocessor is also context-sensitive. I'll eventually add some scope to it too, but to do that I'd have to change around the symbol table inheritance.
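For reference, the symbol table in the preprocessor keeps a stack of maps: opening a file starts from an empty table, opening a package starts from a copy of the current one, and closing restores whatever was there before. A stripped-down sketch of that idea (ScopedSymbolsSketch is a made-up name, and it leaves out the error reporting):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified version of a stack-based symbol table.
final class ScopedSymbolsSketch {
    private final Deque<Map<String, String>> saved = new ArrayDeque<>();
    private Map<String, String> current = new HashMap<>();

    // Start a fresh scope that sees nothing from the outer one.
    void push() {
        saved.push(current);
        current = new HashMap<>();
    }

    // Start a scope that begins as a copy of the outer one.
    void pushInherit() {
        saved.push(current);
        current = new HashMap<>(current);
    }

    // Drop the current scope and restore the previous one.
    void pop() {
        current = saved.pop();
    }

    void define(String name, String value) {
        current.put(name, value);
    }

    // Returns null for unknown names.
    String get(String name) {
        return current.get(name);
    }

    public static void main(String[] args) {
        ScopedSymbolsSketch table = new ScopedSymbolsSketch();
        table.define("x", "1");
        table.pushInherit();
        table.define("x", "2");
        table.pop();
        System.out.println(table.get("x")); // prints 1
    }
}
```

Because inheriting copies the map, changes made in an inner scope never leak back out. Proper nested scoping would probably want a lookup that walks up the stack instead of copying, which is the kind of change mentioned above.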
Here is the code for the interpreter. The grammar itself is very small.
Java:
grammar Expr;
@parser::members
{
AntlrLexer.Environment environment = null;
public TokenStreamRewriter rewriter = null;
public static class Value
{
public final ValueType type;
public final int integerValue;
public final double doubleValue;
public final String stringValue;
public final boolean booleanValue;
public Value()
{
type = ValueType.INVALID;
integerValue = 0;
doubleValue = 0;
stringValue = null;
booleanValue = false;
}
public Value(int value)
{
type = ValueType.INTEGER;
integerValue = value;
doubleValue = 0;
stringValue = null;
booleanValue = false;
}
public Value(double value)
{
type = ValueType.DOUBLE;
doubleValue = value;
integerValue = 0;
stringValue = null;
booleanValue = false;
}
public Value(String value)
{
type = ValueType.STRING;
stringValue = value;
integerValue = 0;
doubleValue = 0;
booleanValue = false;
}
public Value(boolean value)
{
type = ValueType.BOOLEAN;
booleanValue = value;
integerValue = 0;
doubleValue = 0;
stringValue = null;
}
public boolean getBoolean()
{
switch (type)
{
case BOOLEAN:
return booleanValue;
case INTEGER:
return integerValue != 0;
case DOUBLE:
return doubleValue != 0;
case STRING:
return Boolean.parseBoolean(stringValue);
default:
return false;
}
}
public int getInteger()
{
switch (type)
{
case BOOLEAN:
return booleanValue? 1 : 0;
case INTEGER:
return integerValue;
case DOUBLE:
return (int)doubleValue;
case STRING:
try
{
return Integer.parseInt(stringValue);
}
catch (Exception e)
{
return 0;
}
default:
return 0;
}
}
public double getDouble()
{
switch (type)
{
case BOOLEAN:
return booleanValue? 1 : 0;
case INTEGER:
return integerValue;
case DOUBLE:
return doubleValue;
case STRING:
try
{
return Double.parseDouble(stringValue);
}
catch (Exception e)
{
return 0;
}
default:
return 0;
}
}
public String getString()
{
switch (type)
{
case BOOLEAN:
return Boolean.toString(booleanValue);
case INTEGER:
return Integer.toString(integerValue);
case DOUBLE:
return Double.toString(doubleValue);
case STRING:
return stringValue;
default:
return "";
}
}
public Value and(Value other)
{
return new Value(getBoolean() && other.getBoolean());
}
public Value or(Value other)
{
return new Value(getBoolean() || other.getBoolean());
}
public Value eq(Value other)
{
switch (type)
{
case BOOLEAN:
if (other.type == ValueType.BOOLEAN)
{
return new Value(booleanValue == other.booleanValue);
} //if
else
{
return new Value(booleanValue == other.getBoolean());
} //else
case INTEGER:
if (other.type == ValueType.INTEGER)
{
return new Value(integerValue == other.integerValue);
} //if
else
{
return new Value(integerValue == other.getInteger());
} //else
case DOUBLE:
if (other.type == ValueType.DOUBLE)
{
return new Value(doubleValue == other.doubleValue);
} //if
else
{
return new Value(doubleValue == other.getDouble());
} //else
case STRING:
if (other.type == ValueType.STRING)
{
return new Value(stringValue.equals(other.stringValue));
} //if
else
{
return new Value(stringValue.equals(other.getString()));
} //else
default:
return new Value(false);
} //switch
}
public Value neq(Value other)
{
switch (type)
{
case BOOLEAN:
if (other.type == ValueType.BOOLEAN)
{
return new Value(booleanValue != other.booleanValue);
} //if
else
{
return new Value(booleanValue != other.getBoolean());
} //else
case INTEGER:
if (other.type == ValueType.INTEGER)
{
return new Value(integerValue != other.integerValue);
} //if
else
{
return new Value(integerValue != other.getInteger());
} //else
case DOUBLE:
if (other.type == ValueType.DOUBLE)
{
return new Value(doubleValue != other.doubleValue);
} //if
else
{
return new Value(doubleValue != other.getDouble());
} //else
case STRING:
if (other.type == ValueType.STRING)
{
return new Value(!stringValue.equals(other.stringValue));
} //if
else
{
return new Value(!stringValue.equals(other.getString()));
} //else
default:
return new Value(false);
} //switch
}
public Value lt(Value other)
{
switch (type)
{
case BOOLEAN:
return new Value(getInteger() < other.getDouble());
case INTEGER:
if (other.type == ValueType.INTEGER)
{
return new Value(integerValue < other.integerValue);
} //if
else
{
return new Value(integerValue < other.getDouble());
} //else
case DOUBLE:
if (other.type == ValueType.DOUBLE)
{
return new Value(doubleValue < other.doubleValue);
} //if
else
{
return new Value(doubleValue < other.getDouble());
} //else
case STRING:
return new Value(getDouble() < other.getDouble());
default:
return new Value(false);
} //switch
}
public Value gt(Value other)
{
switch (type)
{
case BOOLEAN:
return new Value(getInteger() > other.getDouble());
case INTEGER:
if (other.type == ValueType.INTEGER)
{
return new Value(integerValue > other.integerValue);
} //if
else
{
return new Value(integerValue > other.getDouble());
} //else
case DOUBLE:
if (other.type == ValueType.DOUBLE)
{
return new Value(doubleValue > other.doubleValue);
} //if
else
{
return new Value(doubleValue > other.getDouble());
} //else
case STRING:
return new Value(getDouble() > other.getDouble());
default:
return new Value(false);
} //switch
}
public Value lteq(Value other)
{
switch (type)
{
case BOOLEAN:
return new Value(getInteger() <= other.getDouble());
case INTEGER:
if (other.type == ValueType.INTEGER)
{
return new Value(integerValue <= other.integerValue);
} //if
else
{
return new Value(integerValue <= other.getDouble());
} //else
case DOUBLE:
if (other.type == ValueType.DOUBLE)
{
return new Value(doubleValue <= other.doubleValue);
} //if
else
{
return new Value(doubleValue <= other.getDouble());
} //else
case STRING:
return new Value(getDouble() <= other.getDouble());
default:
return new Value(false);
} //switch
}
public Value gteq(Value other)
{
switch (type)
{
case BOOLEAN:
return new Value(getInteger() >= other.getDouble());
case INTEGER:
if (other.type == ValueType.INTEGER)
{
return new Value(integerValue >= other.integerValue);
} //if
else
{
return new Value(integerValue >= other.getDouble());
} //else
case DOUBLE:
if (other.type == ValueType.DOUBLE)
{
return new Value(doubleValue >= other.doubleValue);
} //if
else
{
return new Value(doubleValue >= other.getDouble());
} //else
case STRING:
return new Value(getDouble() >= other.getDouble());
default:
return new Value(false);
} //switch
}
public Value not()
{
if (type == ValueType.BOOLEAN)
{
return new Value(!booleanValue);
} //if
else
{
return new Value(!getBoolean());
} //else
}
public Value add(Value other)
{
switch (type)
{
case DOUBLE:
return new Value(doubleValue + other.getDouble());
case STRING:
return new Value(stringValue + other.getString());
default:
switch (other.type)
{
case DOUBLE:
return new Value(getInteger() + other.doubleValue);
case STRING:
if (other.stringValue.contains("."))
{
try
{
return new Value(getInteger() + Double.parseDouble(other.stringValue));
} //try
catch (Exception e)
{
//not a number after all, so fall back to concatenation
return new Value(getString() + other.stringValue);
} //catch
} //if
else
{
try
{
return new Value(getInteger() + Integer.parseInt(other.stringValue));
} //try
catch (Exception e)
{
return new Value(getString() + other.stringValue);
} //catch
} //else
default:
return new Value(getInteger() + other.getInteger());
}
} //switch
}
public Value sub(Value other)
{
switch (type)
{
case DOUBLE:
return new Value(doubleValue - other.getDouble());
case STRING:
return new Value(stringValue.replace(other.getString(), ""));
default:
switch (other.type)
{
case DOUBLE:
return new Value(getInteger() - other.doubleValue);
case STRING:
if (other.stringValue.contains("."))
{
try
{
return new Value(getInteger() - Double.parseDouble(other.stringValue));
} //try
catch (Exception e)
{
return new Value(getString().replace(other.stringValue, ""));
} //catch
} //if
else
{
try
{
return new Value(getInteger() - Integer.parseInt(other.stringValue));
} //try
catch (Exception e)
{
return new Value(getString().replace(other.stringValue, ""));
} //catch
} //else
default:
return new Value(getInteger() - other.getInteger());
}
} //switch
}
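public String cross(String set)
{
//walk chars, not bytes: appending a byte appends its numeric value
//getString() rather than stringValue so a number on the left side works too
String newString = "";
for (char c : getString().toCharArray())
{
newString = newString + c + set;
} //for
return newString;
}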
public String multiplyString(int multiple)
{
String newString = "";
for (int i = 0; i < multiple; ++i)
{
newString = newString + stringValue;
} //for
return newString;
} //multiplyString
public Value mul(Value other)
{
switch (type)
{
case DOUBLE:
return new Value(doubleValue*other.getDouble());
case STRING:
switch (other.type)
{
case STRING:
return new Value(cross(other.getString()));
default:
return new Value(multiplyString(other.getInteger()));
}
default:
switch (other.type)
{
case DOUBLE:
return new Value(getInteger()*other.doubleValue);
case STRING:
if (other.stringValue.contains("."))
{
try
{
return new Value(getInteger()*Double.parseDouble(other.stringValue));
} //try
catch (Exception e)
{
return new Value(cross(other.getString()));
} //catch
} //if
else
{
try
{
return new Value(getInteger()*Integer.parseInt(other.stringValue));
} //try
catch (Exception e)
{
return new Value(cross(other.getString()));
} //catch
} //else
default:
return new Value(getInteger()*other.getInteger());
}
} //switch
}
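public String intersect(String set)
{
//getString() rather than stringValue so a number on the left side works too
char[] chars = set.toCharArray();
int len = chars.length;
String newString = "";
for (char c : getString().toCharArray())
{
for (int i = 0; i < len; ++i)
{
if (c == chars[i])
{
newString += c;
chars[i] = chars[--len]; //consume the matched character
break; //each character on the left matches at most once
} //if
} //for
}
return newString;
}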
public Value div(Value other)
{
switch (type)
{
case DOUBLE:
return new Value(doubleValue/other.getDouble());
case STRING:
return new Value(intersect(other.getString()));
default:
switch (other.type)
{
case DOUBLE:
return new Value(getInteger()/other.doubleValue);
case STRING:
if (other.stringValue.contains("."))
{
try
{
return new Value(getInteger()/Double.parseDouble(other.stringValue));
} //try
catch (Exception e)
{
return new Value(intersect(other.getString()));
} //catch
} //if
else
{
try
{
return new Value(getInteger()/Integer.parseInt(other.stringValue));
} //try
catch (Exception e)
{
return new Value(intersect(other.getString()));
} //catch
} //else
default:
return new Value(getInteger()/other.getInteger());
}
} //switch
}
public String disjoint(String set)
{
String oldString = getString();
int index;
for (char c : getString().toCharArray())
{
index = set.indexOf(c);
if (index != -1)
{
//cancel one matching pair: drop the character from both sides
set = set.substring(0, index) + set.substring(index + 1);
index = oldString.indexOf(c);
oldString = oldString.substring(0, index) + oldString.substring(index + 1);
} //if
} //for
return oldString + set;
} //disjoint
public Value mod(Value other)
{
switch (type)
{
case DOUBLE:
return new Value(doubleValue%other.getDouble());
case STRING:
return new Value(disjoint(other.getString()));
default:
switch (other.type)
{
case DOUBLE:
return new Value(getInteger()%other.doubleValue);
case STRING:
if (other.stringValue.contains("."))
{
try
{
return new Value(getInteger()%Double.parseDouble(other.stringValue));
} //try
catch (Exception e)
{
return new Value(disjoint(other.getString()));
} //catch
} //if
else
{
try
{
return new Value(getInteger()%Integer.parseInt(other.stringValue));
} //try
catch (Exception e)
{
return new Value(disjoint(other.getString()));
} //catch
} //else
default:
return new Value(getInteger()%other.getInteger());
}
} //switch
}
}
public Value interpretVariable(String text)
{
if (text.charAt(0) == '"')
{
return new Value(text.substring(1, text.length() - 1));
} //if
else if (text.equals("true") || text.equals("false"))
{
//compare contents, == only compares references
return new Value(text.equals("true"));
} //else if
else if (text.contains("."))
{
return new Value(Double.parseDouble(text));
} //else if
else
{
return new Value(Integer.parseInt(text));
} //else
}
public ExprParser(TokenStreamRewriter rewriter, AntlrLexer.Environment environment)
{
super(rewriter.getTokenStream());
this.rewriter = rewriter;
this.environment = environment;
_interp = new ParserATNSimulator(this,_ATN,_decisionToDFA,_sharedContextCache);
}
}
start returns [Value v]
: o=expr {$v = $o.v;}
;
expr returns [Value v]
: '!' o=expr {$v = $o.v.not();}
| left=expr '%' right=expr {$v = $left.v.mod($right.v);}
| left=expr '/' right=expr {$v = $left.v.div($right.v);}
| left=expr '*' right=expr {$v = $left.v.mul($right.v);}
| left=expr '+' right=expr {$v = $left.v.add($right.v);}
| left=expr '-' right=expr {$v = $left.v.sub($right.v);}
| left=expr '<' right=expr {$v = $left.v.lt($right.v);}
| left=expr '<=' right=expr {$v = $left.v.lteq($right.v);}
| left=expr '>' right=expr {$v = $left.v.gt($right.v);}
| left=expr '>=' right=expr {$v = $left.v.gteq($right.v);}
| left=expr '==' right=expr {$v = $left.v.eq($right.v);}
| left=expr '!=' right=expr {$v = $left.v.neq($right.v);}
| left=expr '&&' right=expr {$v = $left.v.and($right.v);}
| left=expr '||' right=expr {$v = $left.v.or($right.v);}
| VARIABLE '=' right=expr {$v = $right.v; environment.define($VARIABLE.text, $right.v.getString());}
| '(' o=expr ')' {$v = $o.v;}
| STRING {$v = new Value($STRING.text.substring(1, $STRING.text.length() - 1));}
| INTEGER {$v = new Value(Integer.parseInt($INTEGER.text));}
| DOUBLE {$v = new Value(Double.parseDouble($DOUBLE.text));}
| BOOLEAN {$v = new Value(Boolean.parseBoolean($BOOLEAN.text));}
| VARIABLE {$v = interpretVariable(environment.get($VARIABLE.text));}
;
BOOLEAN : 'true'|'false';
INTEGER : [1-9]?[0-9]+;
DOUBLE : [1-9]?[0-9]+'.'[0-9]*;
STRING : '"' (~[\\"] | '\\' .)* '"'
{
_text = _input.getText(Interval.of(_tokenStartCharIndex, _input.index() - 1));
_text = _text.replace("\\n", "\n");
_text = _text.replace("\\r", "\r");
_text = _text.replace("\\t", "\t");
_text = _text.replace("\\b", "\b");
_text = _text.replace("\\f", "\f");
_text = _text.replaceAll("\\\\(.)", "$1");
}
;
VARIABLE: [a-zA-Z_][a-zA-Z_0-9]*;
WS : [ \r\n\t] -> skip;
COMMENTS :
( '/*' .*? '*/'
| '//' ~[\r\n]*
)+
-> skip
;
Here is the code for the preprocessor.
Java:
/*
* to do
*
* string interpolation
*
* packages (dread)
*
* save the package into the package table (the string)
* recall the package later on
*
* import will also import packages****
*
* variable assignments
*
* evaluate
*/
// output is a list of script tokens
//
// import file with arguments
//
// #<import("filename" "arg1" "arg2" "arg3" arg4)>
//
// read variable (string interpolation)
//
// #<$var>
//
// assign variable
//
// #<assign $var expression>
//
// expression
//
// $variable + 19/2
//
// conditionals (no " " == var, " " == literal)
// else statement is just an empty expression (null == true)
// false is 0 or "false"
// true is anything else
// a string that isn't an integer running off of a comparison will be 0
// you may nest conditional
//
// ==, !=, <=, >=, <, >, &&, ||
//
// #<if (expression)>
// #<end>
//
// create package with arguments
//
// #<package name>
// <arguments>
// <end>
// #<end>
//
// import package with arguments (inherits current file/package arguments)
//
// #<implement packageName("arg1" "arg2" "arg3" arg4)>
lexer grammar AntlrLexer;
@header
{
import org.antlr.v4.runtime.ANTLRFileStream;
import org.antlr.v4.runtime.CommonTokenStream;
import java.util.HashMap;
import java.util.Stack;
import java.util.LinkedList;
import java.util.Map;
}
tokens
{
SCRIPT
}
@members
{
private ExprParser.Value evaluate(String expr)
{
if (expr == null || expr.isEmpty())
{
return new ExprParser.Value(true);
}
ExprLexer lexer = new ExprLexer(new ANTLRInputStream(expr));
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
TokenStreamRewriter rewriter = new TokenStreamRewriter(tokenStream);
tokenStream.fill();
return new ExprParser(rewriter, environment).start().v;
}
private class SymbolTable
{
private Stack<HashMap<String, String>> symbols = new Stack<HashMap<String, String>>();
private HashMap<String,String> symbolTable = new HashMap<String, String>();
public void push()
{
symbols.push(symbolTable);
symbolTable = new HashMap<String, String>();
}
public void pushInherit()
{
HashMap<String,String> symbolTable = new HashMap<String, String>();
inherit(symbolTable, this.symbolTable);
symbols.push(this.symbolTable);
this.symbolTable = symbolTable;
}
public void pop()
{
symbolTable = symbols.pop();
}
public void define(String symbol, String value)
{
symbolTable.put(symbol, value);
}
public void undefine(String symbol)
{
symbolTable.remove(symbol);
}
public String get(String symbol)
{
if (!symbolTable.containsKey(symbol))
{
error("Invalid variable name: '" + symbol + "'");
return null;
} //if
return symbolTable.get(symbol);
}
public void inherit(HashMap<String,String> child, HashMap<String,String> parent)
{
for (Map.Entry<String, String> entry : parent.entrySet()) {
child.put(entry.getKey(), entry.getValue());
}
}
}
public class Environment
{
private class InputState
{
public final int line;
public final int charPosition;
public final CharStream input;
public final Pair<TokenSource, CharStream> tokenFactory;
public InputState()
{
line = _interp.getLine();
charPosition = _interp.getCharPositionInLine();
input = _input;
tokenFactory = _tokenFactorySourcePair;
}
public void load()
{
_input = input;
_tokenFactorySourcePair = tokenFactory;
_interp.setLine(line);
_interp.setCharPositionInLine(charPosition);
}
}
private SymbolTable symbolTable = new SymbolTable();
private SymbolTable packageTable = new SymbolTable();
private Stack<InputState> inputStates = new Stack<InputState>();
private LinkedList<String> args = new LinkedList<String>();
public boolean openPackage(String whichPackage)
{
ANTLRInputStream input = null;
try
{
input = new ANTLRInputStream(packageTable.get(whichPackage));
}
catch (Exception e)
{
e.printStackTrace();
}
if (input == null)
{
return false;
}
/*
* replace input
*/
inputStates.push(new InputState());
_input = input;
_interp.setLine(0);
_interp.setCharPositionInLine(0);
/*
* replace symbols
*/
symbolTable.pushInherit();
packageTable.pushInherit();
/*
* go to top mode
*/
pushMode(0);
return true;
}
public boolean open(String filename)
{
ANTLRFileStream input = null;
try
{
input = new ANTLRFileStream(filename);
}
catch (Exception e)
{
e.printStackTrace();
}
if (input == null)
{
return false;
}
/*
* replace input
*/
inputStates.push(new InputState());
_input = input;
_tokenFactorySourcePair = new Pair<TokenSource, CharStream>(AntlrLexer.this, input);
_interp.setLine(0);
_interp.setCharPositionInLine(0);
/*
* replace symbols
*/
symbolTable.push();
packageTable.push();
/*
* go to top mode
*/
pushMode(0);
return true;
}
public boolean close()
{
if (inputStates.isEmpty())
{
return false;
}
/*
* load previous input
*/
inputStates.pop().load();
/*
* load previous symbols
*/
symbolTable.pop();
packageTable.pop();
/*
* go to previous mode
*/
popMode();
_hitEOF = false;
return true;
}
public void define(String symbol, String value)
{
if (value != null)
{
symbolTable.define(symbol, value);
}
}
public void undefine(String symbol)
{
symbolTable.undefine(symbol);
}
public String get(String symbol)
{
return symbolTable.get(symbol);
}
public void pushArg(String arg)
{
args.addLast(arg);
}
public String popArg()
{
if (args.isEmpty())
{
return null;
}
return args.pop();
}
public void clearArgs()
{
args.clear();
}
public boolean isEmpty()
{
return inputStates.isEmpty();
}
}
/*
* this manages
*
* input
* symbol table
*/
private Environment environment = new Environment();
/*
* override to close current input when at EOF as there may be multiple
* inputs
*/
@Override
public Token nextToken()
{
Token token = super.nextToken();
while (token.getType() == Token.EOF && environment.close())
{
token = super.nextToken();
}
return token;
}
private class BlockState
{
public final String close;
public BlockState(String close)
{
this.close = close;
}
}
private java.util.Stack<BlockState> block = new java.util.Stack<BlockState>();
private java.util.Stack<Boolean> enabled = new java.util.Stack<Boolean>();
private java.util.Stack<ScriptBlock> scriptBlock = new java.util.Stack<ScriptBlock>();
private class ScriptBlock
{
public ScriptBlock()
{
scriptBlock.push(this);
}
public void onExit()
{
scriptBlock.pop();
}
} //ScriptBlock
private class IfBlock extends ScriptBlock
{
public IfBlock(boolean enable)
{
super();
enabled.push(enable);
} //IfBlock
@Override public void onExit()
{
super.onExit();
enabled.pop();
}
}
public void popBlock()
{
if (!scriptBlock.isEmpty())
{
scriptBlock.peek().onExit();
} //if
else
{
error("Attempt to close a block that does not exist");
} //else
}
public boolean isEnabled()
{
return enabled.isEmpty() || enabled.peek();
}
private boolean valid = true;
public boolean isValid() { return valid; }
private void error(final String message)
{
valid = false;
getErrorListenerDispatch().syntaxError(
AntlrLexer.this,
null,
_tokenStartLine,
_tokenStartCharPositionInLine,
message + ": " + getCurrentText(),
null
);
}
private String getCurrentText(int start, int end)
{
return _input.getText(Interval.of(_tokenStartCharIndex + start, _input.index() + end));
}
private String getCurrentText()
{
return _input.getText(Interval.of(_tokenStartCharIndex, _input.index()));
}
private void checkForClose()
{
if (!block.isEmpty() && _input.LA(2) == EOF && environment.isEmpty())
{
error("Missing closing '" + block.peek().close + "'");
pop(-1, false);
}
if (!scriptBlock.isEmpty() && _input.LA(2) == EOF)
{
error("Missing closing '<end>'");
popBlock();
}
}
//or
private boolean la(String ... ts)
{
if (ts != null)
{
int i = 0;
int len = 0;
byte ahead;
for (String s : ts)
{
i = 0;
len = s.length();
while (i < len)
{
ahead = (byte)_input.LA(1 + i);
if (ahead == -1 || ahead != s.charAt(i))
{
len = 0;
}
else
{
++i;
}
}
if (len > 0)
{
return true;
}
}
if (len == 0)
{
return false;
}
}
return true;
}
//and
private boolean nla(String ... ts)
{
if (ts != null)
{
int i = 0;
int len = 0;
byte ahead;
for (String s : ts)
{
i = 0;
len = s.length();
while (i < len)
{
ahead = (byte)_input.LA(1 + i);
if (ahead != -1 && ahead != s.charAt(i))
{
len = 0;
}
else
{
++i;
}
}
if (len > 0)
{
return false;
}
}
}
return true;
}
private boolean cont(int t, boolean o)
{
if (o)
{
more();
}
else if (t < 0 || !isEnabled())
{
skip();
}
else
{
_type = t;
}
checkForClose();
return o;
}
private boolean push(String c, int m, int t, boolean o)
{
block.push(new BlockState(c));
pushMode(m);
cont(t, o);
return o;
}
private boolean pop(int t, boolean o)
{
block.pop();
popMode();
cont(t, o);
return o;
}
private boolean cont(int t, String ... ts)
{
return cont(t, la(ts));
}
private boolean ncont(int t, String ... ts)
{
return cont(t, nla(ts));
}
private boolean push(String c, int m, int t, String ... ts)
{
return push(c, m, t, la(ts));
}
private boolean npush(String c, int m, int t, String ... ts)
{
return push(c, m, t, nla(ts));
}
private boolean pop(int t, String ... ts)
{
return pop(t, la(ts));
}
private boolean npop(int t, String ... ts)
{
return pop(t, nla(ts));
}
public String expression = "";
public int expressionDepth = 0;
}
/*
* if type is less than 0, skip
* type only matters if not continue
*
* - params
* String ... searchStrings
* String closingString
*
* int useType
* int goToMode
*
* boolean continue
*
* - conditions
* private boolean la(stringsThatMustBeFound)
* returns true if any of these are found
*
* private boolean nla(stringsThatMustNotBeFound)
* returns true if none of the strings are found
*
* - includes next character in token if continue
* - when a string is provided instead of a boolean, the condition for that
* - string must be true for the next text to be included in the current token
* boolean cont(useType, continue?)
* boolean cont(goToMode, stringsThatMustBeFound)
* boolean ncont(goToMode, stringsThatMustNotBeFound)
*
* - goes to mode and consumes next token if continue
* - a type < 0 is skipped
* boolean push(closingString, goToMode, useType, continue?)
* boolean push(closingString, goToMode, useType, stringsThatMustBeFound)
* boolean npush(String closingString, goToMode, useType, stringsThatMustNotBeFound)
*
* - pops from mode and consumes next token if continue
* boolean pop(useType, continue?)
* boolean pop(useType, stringsThatMustBeFound)
* boolean npop(useType, stringsThatMustNotBeFound)
*
* boolean environment.open(filename)
* environment.define(symbol, value)
* environment.undefine(symbol)
* string environment.get(symbol)
* environment.pushArg(arg)
* String environment.popArg()
* environment.clearArgs()
* String getCurrentText()
* String getCurrentText(start, end)
*
* - tokens are skipped if enabled is false
* boolean isEnabled()
* void pushEnabled(boolean enable)
* void popEnabled()
*/
//header
WS : [ \t\r\n]+
-> skip
;
COMMENTS : ( '/*' .*? '*/'
| '//' ~[\r\n]*
)+
-> skip
;
ARGUMENTS : '#<arguments>'
{
_mode = Normal;
push("#<end>", Arguments, -1, false);
}
;
ANY : .
{
_mode = Normal;
_input.seek(_input.index() - 1);
skip();
}
;
//script
// #{ }
//args
// #[ ]
mode Normal //Antlr script
;
//this will continue to consume characters for a given token until the #< sequence is hit
CHAR_SEQUENCE : ( ~[`\'"\[\]\\*/#]
| WS
| COMMENTS
| '`' (~[\\`] | '\\' .)* '`' // ` `
| '"' (~[\\"] | '\\' .)* '"' // " "
| '[' (~[\\\]\[] | '\\' .)* ']' // [ ]
| '\'' (~[\\\'] | '\\' .)* '\'' // ' '
| '#'
| '/'
| '*'
)
{
ncont(SCRIPT, "#<");
}
;
PARAM_START : '['
{
push("]", Param, SCRIPT, true);
}
;
PRE_START : '#<'
{
push(">", Pre, -1, false);
skip();
}
;
mode Arguments //<arguments> <end>
;
Arguments_WS : (WS | COMMENTS)+
{
skip();
}
;
Arguments_RBRACK : '#<end>'
{
if (isEnabled())
{
environment.clearArgs();
}
pop(-1, false);
}
;
//the following will plug an argument into a symbol defined by the argument
//label
//[Symbol = InputArgument]
//Arguments are pushed by import and popped by this. If not all arguments are used, the arguments
//get cleared.
Arguments_ARGUMENT : [_a-zA-Z0-9]+
{
if (isEnabled())
{
environment.define(getCurrentText(0, -1), environment.popArg());
}
skip();
}
;
mode Param //[ ] from Antlr
;
Param_ANY : ( ~[\][#<]
| '\\' (~[\]] | EOF)
| '#' (~[<] | EOF)
)+
{
ncont(SCRIPT, "]");
}
;
Param_PARAM_START : '['
{
npush("]", Param, SCRIPT, "#<");
}
;
Param_END : ']'
{
npop(SCRIPT, "#<");
}
;
Param_PRE_START : '#<'
{
push(">", Pre, -1, false);
skip();
}
;
//#<import(file, args...)>
mode ImportStart
;
ImportStart_WS : (WS | COMMENTS)+ {skip();};
ImportStart_Parens : '(' {_mode = Import; skip();};
mode Import
;
Import_WS : (WS | COMMENTS)+ {skip();};
Import_FILE : '"' (~[\\"] | '\\' .)* '"'
{
if (isEnabled())
{
_text = getCurrentText(1, -2);
_text = _text.replace("\\n", "\n");
_text = _text.replace("\\r", "\r");
_text = _text.replace("\\t", "\t");
_text = _text.replace("\\b", "\b");
_text = _text.replace("\\f", "\f");
_text = _text.replaceAll("\\\\(.)", "$1");
environment.pushArg(_text);
}
skip();
_mode = ImportArg;
}
;
Import_ARG_READ : [_a-zA-Z0-9]+
{
if (isEnabled())
{
environment.pushArg(environment.get(getCurrentText(0, -1)));
}
skip();
_mode = ImportArg;
}
;
Import_END : ')'
{
pop(-1, false);
skip();
if (isEnabled())
{
environment.open(environment.popArg());
}
}
;
mode ImportArg
;
ImportArg_WS : (WS | COMMENTS)+ {skip();};
ImportArg_ARG : '"' (~[\\"] | '\\' .)* '"'
{
if (isEnabled())
{
_text = getCurrentText(1, -2);
_text = _text.replace("\\n", "\n");
_text = _text.replace("\\r", "\r");
_text = _text.replace("\\t", "\t");
_text = _text.replace("\\b", "\b");
_text = _text.replace("\\f", "\f");
_text = _text.replaceAll("\\\\(.)", "$1");
environment.pushArg(_text);
}
skip();
}
;
ImportArg_ARG_READ : [_a-zA-Z0-9]+
{
if (isEnabled())
{
environment.pushArg(environment.get(getCurrentText(0, -1)));
}
skip();
}
;
ImportArg_END : ')'
{
pop(-1, false);
skip();
if (isEnabled())
{
environment.open(environment.popArg());
}
}
;
mode Arg
;
Arg_WS : (WS | COMMENTS)+ {skip();};
Arg_VAL : [_a-zA-Z0-9]+
{
if (isEnabled())
{
String value = environment.get(getCurrentText(0, -1));
//System.out.println("value: " + value);
if (value == null)
{
skip();
} //if
else
{
_text = value;
_type = SCRIPT;
} //else
}
else
{
skip();
}
popMode();
}
;
mode EvalStart;
EvalStart_WS : (WS | COMMENTS)+ {skip();};
EvalStart_Parens : '('
{
_mode = Eval;
skip();
expression = "";
expressionDepth = 0;
};
mode Eval
;
Eval_WS : (WS | COMMENTS)+ {skip();};
Eval_EXPR : (~["()]
| '"' (~[\\"] | '\\' .)* '"')
{
expression += getCurrentText(0, -1);
skip();
}
;
Eval_EXPR_PARENS : '('
{
++expressionDepth;
push(")", Eval, -1, false);
expression += "(";
}
;
Eval_END : ')'
{
pop(-1, false);
if (expressionDepth-- > 0)
{
expression += ")";
}
else
{
ExprParser.Value value = evaluate(expression);
System.out.println("Value: " + value.getString() + " == " + expression);
new IfBlock(value.getBoolean());
}
}
;
mode Package
;
Package_WS : (WS | COMMENTS)+ {skip();};
Package_FILE : '"' (~[\\"] | '\\' .)* '"'
{
if (isEnabled())
{
_text = getCurrentText(1, -2);
_text = _text.replace("\\n", "\n");
_text = _text.replace("\\r", "\r");
_text = _text.replace("\\t", "\t");
_text = _text.replace("\\b", "\b");
_text = _text.replace("\\f", "\f");
_text = _text.replace("\\\"", "\"");
_text = _text.replaceAll("\\\\(.)", "$1");
environment.pushArg(_text);
}
skip();
_mode = PackageArg;
}
;
Package_FILE_READ : [_a-zA-Z0-9]+
{
if (isEnabled())
{
environment.pushArg(environment.get(getCurrentText()));
}
skip();
}
;
Package_END : '#`}'
{
pop(-1, false);
if (isEnabled())
{
environment.openPackage(environment.popArg());
}
}
;
mode PackageArg
;
PackageArg_WS : (WS | COMMENTS)+ {skip();};
PackageArg_ARG : '"' (~[\\"] | '\\' .)* '"'
{
if (isEnabled())
{
_text = getCurrentText(1, -2);
_text = _text.replace("\\n", "\n");
_text = _text.replace("\\r", "\r");
_text = _text.replace("\\t", "\t");
_text = _text.replace("\\b", "\b");
_text = _text.replace("\\f", "\f");
_text = _text.replace("\\\"", "\"");
_text = _text.replaceAll("\\\\(.)", "$1");
environment.pushArg(_text);
}
skip();
}
;
PackageArg_ARG_READ : [_a-zA-Z0-9]+
{
if (isEnabled())
{
environment.pushArg(environment.get(getCurrentText()));
}
skip();
}
;
PackageArg_END : '#`}'
{
pop(-1, false);
if (isEnabled())
{
environment.openPackage(environment.popArg());
}
}
;
mode Pre
;
Pre_WS : (WS | COMMENTS)+ {skip();};
Pre_IMPORT_START : 'import'
{
_mode = PreEnd;
push(")", ImportStart, -1, false);
}
;
Pre_PACKAGE_START : {false}? 'package'
{
_mode = PreEnd;
//TO DO
//popMode();
//push("#`}", Package, -1, false);
}
;
Pre_ARG_START : '$'
{
_mode = PreEnd;
pushMode(Arg);
skip();
}
;
Pre_EVAL_START : 'if'
{
_mode = PreEnd;
push(")", EvalStart, -1, false);
skip();
}
;
Pre_END : 'end'
{
_mode = PreEnd;
skip();
popBlock();
}
;
mode PreEnd
;
PreEnd_WS : (WS | COMMENTS)+ {skip();};
PreEnd_End : '>'
{
pop(-1, false);
skip();
}
;
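The same unescape sequence is repeated in Import_FILE, ImportArg_ARG, Package_FILE and PackageArg_ARG. Here is a minimal sketch of pulling it into one helper. StringEscapes.unescape is a made-up name and is not part of the lexer above. It also works in a single pass, so a doubled backslash followed by n comes out as a backslash and an n, where the chained replace calls would give a backslash and a newline.

```java
final class StringEscapes
{
    private StringEscapes() {}

    // Hypothetical helper: takes the body of a quoted literal (quotes already
    // stripped, like getCurrentText(1, -2)) and resolves the escape sequences.
    static String unescape(String body)
    {
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); ++i)
        {
            char c = body.charAt(i);
            // Ordinary characters and a trailing lone backslash are copied as-is.
            if (c != '\\' || i + 1 == body.length())
            {
                out.append(c);
                continue;
            }
            char next = body.charAt(++i);
            switch (next)
            {
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 't': out.append('\t'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                default: out.append(next); // \" -> ", \\ -> \, anything else -> itself
            }
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        // The literal body grammar\\input3.g4 becomes grammar\input3.g4
        System.out.println(unescape("grammar\\\\input3.g4"));
    }
}
```

Each of the four rules could then call the helper once instead of carrying its own copy of the replace chain.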
If anyone ever wants to start working with Antlr4 and build their own compilers in the future, I really recommend you use this stuff : P. It'll make your life way easier.
I have the .jar if anyone wants me to attach it.
Also, I figure I'll put up my little grammar test harness. It's a bit better than the default Antlr4 test harness. Running it without any arguments prints the usage.
This is an example. Note here that -lexer Antlr looks for a class called AntlrLexer first, and falls back to a class called Antlr if that isn't found.
grammar/input.g4 -channel Channel -lexer Antlr -package compile.antlr -tokens
Java:
package compile.antlr;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.LinkedList;
import java.util.List;
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.CommonToken;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.DiagnosticErrorListener;
import org.antlr.v4.runtime.Lexer;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.TokenStream;
import org.antlr.v4.runtime.atn.PredictionMode;
public class TestGrammar
{
private Class<? extends Lexer> lexerClass;
private Class<? extends Parser> parserClass;
private Lexer lexer;
private Parser parser;
private boolean arg_tree = false;
private boolean arg_tokens = false;
private boolean arg_gui = false;
private String arg_ps = null;
private String arg_encoding = null;
private boolean arg_trace = false;
private boolean arg_diagnostics = false;
private boolean arg_sll = false;
private String arg_channel = null;
private String arg_lexer;
private String arg_parser;
private String arg_package;
private List<String> arg_input = new LinkedList<String>();
private CommonTokenStream tokens;
private List<Token> tokenList;
private String parserRule;
private String[] tokenNames;
private String[] channelNames = null;
private int getValueWidth()
{
int max = 0;
int len;
String str;
for (Token token : tokenList)
{
str = token.getText();
str = str.replace("\n", "\\n");
str = str.replace("\t", "\\t");
str = str.replace("\r", "\\r");
((CommonToken) token).setText(str);
len = token.getText().length();
if (len > max)
{
max = len;
}
}
return max;
}
private int getTypeWidth()
{
int max = 0;
int type;
int len;
for (Token token : tokenList)
{
type = token.getType();
if (type == -1)
{
len = 3;
}
else
{
len = tokenNames[type].length();
}
if (len > max)
{
max = len;
}
}
return max;
}
private int getChannelWidth()
{
if (channelNames == null)
{
return 4;
}
int max = 0;
int len;
for (Token token : tokenList)
{
len = channelNames[token.getChannel()].length();
if (len > max)
{
max = len;
}
}
return max;
}
private static void printex(String msg, int maxlen, int spacing)
{
int strlen = msg == null ? 0 : msg.length();
int len = 0;
char[] str = msg == null ? new char[0] : msg.toCharArray();
while (len < strlen && len < maxlen)
{
if (str[len] == '\t' || str[len] == '\r' || str[len] == '\n')
{
str[len] = ' ';
}
System.out.print(str[len++]);
}
while (len++ < maxlen)
{
System.out.print(' ');
}
for (int i = spacing; i > 0; --i)
{
System.out.print(' ');
}
}
private static void printex(char c, int len)
{
while (len-- > 0)
{
System.out.print(c);
}
}
private void printTokens(String tabs)
{
if (arg_tokens)
{
tokenList = tokens.getTokens();
final int spacing = 8;
final int typeWidth = getTypeWidth();
final int valueWidth = getValueWidth() + 2;
final int channelWidth = getChannelWidth();
final int width = typeWidth + valueWidth + channelWidth + spacing + spacing;
int type;
System.out.print(tabs + "Tokens {\n");
System.out.print(tabs + "\t");
printex("Type", typeWidth, spacing);
printex("Value", valueWidth, spacing);
printex("Channel", channelWidth, 0);
System.out.println();
System.out.print(tabs + "\t");
printex('-', width);
System.out.print("\n\n");
for (Token token : tokenList)
{
type = token.getType();
System.out.print(tabs + "\t");
printex(type == -1? "EOF" : tokenNames[type], typeWidth, spacing);
printex("|" + token.getText() + "|", valueWidth, spacing);
if (channelNames != null)
{
printex(channelNames[token.getChannel()], channelWidth, 0);
}
else
{
printex(Integer.toString(token.getChannel()), channelWidth, 0);
}
System.out.print('\n');
}
System.out.print(tabs + "}\n");
}
}
private boolean evaluateArgs_assert(String[] args, final int i, final String expected)
{
if (args[i].equals(expected))
{
return true;
}
System.err.println("Expecting [" + expected + "], got [" + args[i] + "]");
return false;
}
private void evaluateArgs_error(final String arg, final String expected)
{
System.err.println("Expecting " + expected + ", got [" + arg + "]");
}
private int evaluateArgs_grammar(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-grammar"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_lexer = args[i];
arg_parser = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[grammarName]");
}
}
else
{
evaluateArgs_error("nothing", "[grammarName]");
}
}
return i;
}
private int evaluateArgs_lexer(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-lexer"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_lexer = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[lexerName]");
}
}
else
{
evaluateArgs_error("nothing", "[lexerName]");
}
}
return i;
}
private int evaluateArgs_parser(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-parser"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_parser = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[parserName]");
}
}
else
{
evaluateArgs_error("nothing", "[parserName]");
}
}
return i;
}
private int evaluateArgs_package(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-package"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_package = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[packageName]");
}
}
else
{
evaluateArgs_error("nothing", "[packageName]");
}
}
return i;
}
private int evaluateArgs_encoding(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-encoding"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_encoding = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[encodingName]");
}
}
else
{
evaluateArgs_error("nothing", "[encodingName]");
}
}
return i;
}
private int evaluateArgs_ps(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-ps"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_ps = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[psName]");
}
}
else
{
evaluateArgs_error("nothing", "[psName]");
}
}
return i;
}
private int evaluateArgs_channel(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-channel"))
{
++i;
if (i < args.length)
{
if (args[i].charAt(0) != '-')
{
arg_channel = args[i];
}
else
{
--i;
evaluateArgs_error(args[i], "[channelName]");
}
}
else
{
evaluateArgs_error("nothing", "[channelName]");
}
}
return i;
}
private int evaluateArgs_tokens(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-tokens"))
arg_tokens = true;
return i;
}
private int evaluateArgs_tree(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-tree"))
arg_tree = true;
return i;
}
private int evaluateArgs_gui(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-gui"))
arg_gui = true;
return i;
}
private int evaluateArgs_trace(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-trace"))
arg_trace = true;
return i;
}
private int evaluateArgs_diagnostics(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-diagnostics"))
arg_diagnostics = true;
return i;
}
private int evaluateArgs_SLL(final String args[], int i)
{
if (evaluateArgs_assert(args, i, "-SLL"))
arg_sll = true;
return i;
}
private int evaluateArgs_input(final String args[], int i)
{
if (args[i].charAt(0) != '-')
{
arg_input.add(args[i]);
}
return i;
}
private int evaluateArgs_g(final String args[], final int i, final int d)
{
switch (args[i].charAt(d))
{
case 'r':
return evaluateArgs_grammar(args, i);
case 'u':
return evaluateArgs_gui(args, i);
default:
evaluateArgs_error(args[i], "[-grammar] [-gui]");
}
return i;
}
private int evaluateArgs_pa(final String args[], final int i, final int d)
{
switch (args[i].charAt(d))
{
case 'r':
return evaluateArgs_parser(args, i);
case 'c':
return evaluateArgs_package(args, i);
default:
evaluateArgs_error(args[i], "[-parser] [-package]");
}
return i;
}
private int evaluateArgs_p(final String args[], final int i, final int d)
{
switch (args[i].charAt(d))
{
case 'a':
return evaluateArgs_pa(args, i, d + 1);
case 's':
return evaluateArgs_ps(args, i);
default:
evaluateArgs_error(args[i], "[-parser] [-ps] [-package]");
}
return i;
}
private int evaluateArgs_tr(final String args[], final int i, final int d)
{
switch (args[i].charAt(d))
{
case 'e':
return evaluateArgs_tree(args, i);
case 'a':
return evaluateArgs_trace(args, i);
default:
evaluateArgs_error(args[i], "[-tree] [-trace]");
}
return i;
}
private int evaluateArgs_t(final String args[], final int i, final int d)
{
switch (args[i].charAt(d))
{
case 'o':
return evaluateArgs_tokens(args, i);
case 'r':
return evaluateArgs_tr(args, i, d + 1);
default:
evaluateArgs_error(args[i], "[-tokens] [-tree] [-trace]");
}
return i;
}
private int evaluateArgs_1(final String args[], final int i, final int d)
{
switch (args[i].charAt(d))
{
case 'g':
return evaluateArgs_g(args, i, d + 1);
case 'l':
return evaluateArgs_lexer(args, i);
case 'p':
return evaluateArgs_p(args, i, d + 1);
case 't':
return evaluateArgs_t(args, i, d + 1);
case 'e':
return evaluateArgs_encoding(args, i);
case 'd':
return evaluateArgs_diagnostics(args, i);
case 's':
return evaluateArgs_SLL(args, i);
case 'c':
return evaluateArgs_channel(args, i);
default:
evaluateArgs_error(args[i],
"[-grammar] [-lexer] [-parser] [-package] [-channel] [-tokens] [-tree] [-gui] [-trace] [-diagnostics] [-SLL] [-ps] [-encoding]");
}
return i;
}
private int evaluateArgs_0(final String args[], final int i, final int d)
{
if (args[i].length() < 3)
{
evaluateArgs_error(args[i],
"[-grammar] [-lexer] [-parser] [-package] [-channel] [-tokens] [-tree] [-gui] [-trace] [-diagnostics] [-SLL] [-ps] [-encoding]");
return i;
}
switch (args[i].charAt(d))
{
case '-':
return evaluateArgs_1(args, i, d + 1);
default:
return evaluateArgs_input(args, i);
}
}
private void evaluateArgs_len0(String args[])
{
if (args.length == 0)
{
System.err.print("Arguments\n-------------------------------------------------------------------------\n\n");
System.err.println("\t([-grammar grammarName] | [-lexer lexerName] [-parser parserName])");
System.err.println("\t[-package packageName]? [-ps psName]? [-encoding encodingName]? [-channel enumName]?");
System.err.println("\t[-tokens]? [-tree]? [-gui]? [-trace]? [-diagnostics]? [-SLL]?");
System.err.println("\t[input-filename]*");
System.err.print("\nDetails\n---------------------------------------------------------------------------\n\n");
System.err.println("\tA lexer of some sort, be it from -grammar or -lexer, must be passed in\n\n");
System.err.println("\t[-grammar grammarName]\n" + "\n\t\t" + "Will attempt to load both lexer and parser of name [grammarName]"
+ "\n\t\t" + "The loaded grammar will be the last appearing [-grammar] argument" + "\n\n");
System.err.println("\t[-lexer lexerName]\n" + "\n\t\t" + "Will attempt to load the lexer of name [lexerName]" + "\n\t\t"
+ "The loaded lexer will be the last appearing [-lexer] argument" + "\n\n");
System.err.println("\t[-parser parserName]\n" + "\n\t\t" + "Will attempt to load the parser of name [parserName]" + "\n\t\t"
+ "The loaded parser will be the last appearing [-parser] argument" + "\n\n\t\t" + "Requires a lexer"
+ "\n\n");
System.err.println("\t[-channel enumName]\n" + "\n\t\t" + "Will use supplied [enumName] for channel names in token output"
+ "\n\t\t" + "Without this, it will use channel ids instead of channel names"
+ "\n\n\t\tExample: -channel Channel" + "\n\n\t\t\t" + "public static enum Channel {" + "\n\t\t\t\t"
+ "OUT," + "\n\t\t\t\t" + "WHITESPACE," + "\n\t\t\t\t" + "COMMENTS" + "\n\n\t\t\t\t"
+ "; public final int value = CHANNEL_INDEX++;" + "\n\t\t\t"
+ "} private static int CHANNEL_INDEX = 0;" + "\n\n");
System.err.println("\t[-package packageName]\n" + "\n\t\t" + "Will load grammar from package [packageName]" + "\n\t\t"
+ "Packages may be specifically applied to the parser and lexer as well" + "\n\t\t"
+ "A package declaration will work with specific lexer and parser package definitions"
+ "\n\n\t\t" + "Loads myPackage.otherPackage.subPackage.lexerName"
+ "\n\n\t\t\t" + "-package myPackage.otherPackage -lexer subPackage.lexerName" + "\n\n");
System.err.println("\t[-ps psName]\n" + "\n\t\t" + "generates a visual representation of the parse tree in PostScript and"
+ "\n\t\t" + "stores it in [psName] (should be of type .ps)" + "\n\n");
System.err.println("\t[-encoding encodingName]\n" + "\n\t\t" + "specifies the input file encoding if the current" + "\n\t\t"
+ "locale would not read the input properly. For example, need this option" + "\n\t\t"
+ "to parse a Japanese-encoded XML file" + "\n\n");
System.err.println("\t[-trace]\n" + "\n\t\t" + "prints the rule name and current token upon rule entry and exit" + "\n\n");
System.err.println("\t[-diagnostics]\n" + "\n\t\t" + "turns on diagnostic messages during parsing. This generates messages"
+ "\n\t\t" + "only for unusual situations such as ambiguous input phrases." + "\n\n");
System.err.println("\t[-SLL]\n" + "\n\t\t" + "uses a faster but slightly weaker parsing strategy" + "\n\n");
System.err.println("\t[input-filename]\n" + "\n\t\t" + "Omitting will read from stdin" + "\n\n");
System.exit(1);
}
}
private void evaluateArgs(String args[])
{
evaluateArgs_len0(args);
for (int i = 0; i < args.length; ++i)
{
i = evaluateArgs_0(args, i, 0);
}
}
public TestGrammar(String args[])
{
evaluateArgs(args);
}
private String getLexerName()
{
if (arg_lexer == null)
{
System.err.println("Missing lexer");
System.exit(1);
}
if (arg_package != null)
{
return arg_package + "." + arg_lexer;
}
else
{
return arg_lexer;
}
}
private String getParserName()
{
if (arg_parser == null)
{
System.err.println("Missing parser");
System.exit(1);
}
if (arg_package != null)
{
return arg_package + "." + arg_parser;
}
else
{
return arg_parser;
}
}
private void loadLexer()
{
String lexerName = getLexerName() + "Lexer";
ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
lexerClass = null;
try
{
lexerClass = classLoader.loadClass(lexerName).asSubclass(Lexer.class);
}
catch (java.lang.ClassNotFoundException cnfe)
{
lexerName = arg_lexer;
try
{
lexerClass = classLoader.loadClass(lexerName).asSubclass(Lexer.class);
}
catch (ClassNotFoundException cnfe2)
{
System.err.println("Unable to load " + lexerName + " as lexer or parser (file wasn't found)");
System.exit(1);
}
}
try
{
Constructor<? extends Lexer> lexerCtor = lexerClass.getConstructor(CharStream.class);
lexer = lexerCtor.newInstance((CharStream) null);
}
catch (Exception e)
{
System.err.println("Unable to instantiate lexer " + lexerName + ": " + e);
System.exit(1);
}
tokenNames = lexer.getTokenNames();
if (arg_channel != null)
{
Class<?> channel = null;
try
{
channel = Class.forName(lexerClass.getName() + "$" + arg_channel);
}
catch (Exception e)
{
System.err.println("[" + arg_channel + " is not a declared member enum of @members of " + arg_lexer);
System.err.println("Using channel id for -tokens instead of channel names");
}
if (channel != null)
{
if (channel.isEnum())
{
if (Modifier.isStatic(channel.getModifiers()))
{
Object[] enumConstants = channel.getEnumConstants();
if (enumConstants.length != 0)
{
channelNames = new String[enumConstants.length];
for (int i = 0; i < enumConstants.length; ++i)
{
channelNames[i] = enumConstants[i].toString();
}
}
else
{
System.err.println("[" + arg_channel + "] has no declared channels");
System.err.println("Using channel id for -tokens instead of channel names");
}
}
else
{
System.err.println("[" + arg_channel + "] is not a static member of @members of " + arg_lexer);
System.err.println("Using channel id for -tokens instead of channel names");
}
}
else
{
System.err.println("[" + arg_channel + "] is not a member enum of @members of " + arg_lexer);
System.err.println("Using channel id for -tokens instead of channel names");
}
}
}
}
private void loadParser()
{
parserClass = null;
parser = null;
if (arg_parser != null)
{
String parserName = getParserName() + "Parser";
ClassLoader classLoader = Thread.currentThread().getContextClassLoader();
try
{
parserClass = classLoader.loadClass(parserName).asSubclass(Parser.class);
}
catch (Exception e)
{
parserName = arg_parser;
try
{
parserClass = classLoader.loadClass(parserName).asSubclass(Parser.class);
}
catch (ClassNotFoundException cnfe2)
{
System.err.println("Unable to load " + parserName + " as parser (file wasn't found)");
System.exit(1);
}
}
try
{
Constructor<? extends Parser> parserCtor = parserClass.getConstructor(TokenStream.class);
parser = parserCtor.newInstance((TokenStream) null);
}
catch (Exception e)
{
System.err.println("Unable to instantiate parser " + parserName + ": " + e);
}
}
if (parser != null)
{
parserRule = parser.getRuleNames()[0];
}
}
private void process()
{
loadLexer();
loadParser();
InputStream inputStream;
Reader reader;
if (arg_input.size() == 0)
{
inputStream = System.in;
reader = null;
try
{
if (arg_encoding != null)
{
reader = new InputStreamReader(inputStream, arg_encoding);
}
else
{
reader = new InputStreamReader(inputStream);
}
}
catch (Exception e)
{
System.err.println("Unsupported encoding [" + arg_encoding + "]");
}
if (reader != null)
{
process(inputStream, reader);
}
}
else
{
for (String inputFile : arg_input)
{
inputStream = null;
reader = null;
try
{
if (inputFile != null)
{
inputStream = new FileInputStream(inputFile);
}
}
catch (Exception e)
{
System.err.println("Could Not Load File [" + inputFile + "]");
}
if (inputStream != null)
{
try
{
if (arg_encoding != null)
{
reader = new InputStreamReader(inputStream, arg_encoding);
}
else
{
reader = new InputStreamReader(inputStream);
}
}
catch (Exception e)
{
System.err.println("Unsupported encoding [" + arg_encoding + "]");
}
if (reader != null)
{
System.out.print(inputFile + " {\n");
process(inputStream, reader);
System.out.print("}\n");
}
}
}
}
}
private void process(InputStream inputStream, Reader reader)
{
try
{
lexer.setInputStream(new ANTLRInputStream(reader));
tokens = new CommonTokenStream(lexer);
// tokens = new UnbufferedTokenStream(lexer);
if (parser != null)
{
if (arg_diagnostics)
{
parser.addErrorListener(new DiagnosticErrorListener());
parser.getInterpreter().setPredictionMode(PredictionMode.LL_EXACT_AMBIG_DETECTION);
}
if (arg_tree || arg_gui || arg_ps != null)
{
parser.setBuildParseTree(true);
}
if (arg_sll)
{
parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
}
parser.setTokenStream(tokens);
parser.setTrace(arg_trace);
if (arg_tree || arg_gui || arg_ps != null)
{
try
{
Method startRule = parserClass.getMethod(parserRule);
ParserRuleContext tree = (ParserRuleContext) startRule.invoke(parser, (Object[]) null);
if (arg_tree)
{
System.out.println("\tTree {\n\t\t" + tree.toStringTree(parser) + "\n\t}");
}
if (arg_gui)
{
tree.inspect(parser);
}
if (arg_ps != null)
{
try
{
tree.save(parser, arg_ps);
}
catch (Exception e)
{
System.out.println("Could not save postscript [" + arg_ps + "]");
}
}
}
catch (Exception e)
{
System.err.println("Parser has invalid start rule [" + parserRule + "]");
}
}
}
else
{
tokens.fill();
}
printTokens("\t");
}
catch (Exception e)
{
System.err.println("Error while processing input: " + e);
}
finally
{
try
{
if (reader != null)
{
reader.close();
}
if (inputStream != null)
{
inputStream.close();
}
}
catch (Exception e)
{
}
}
}
public static void main(String args[])
{
TestGrammar tester = new TestGrammar(args);
tester.process();
}
}
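The -channel lookup in loadLexer relies on the JVM naming a nested class Outer$Inner. Here is a minimal sketch of that lookup on its own. DemoLexer and its Channel enum are made up and stand in for a generated lexer, so nothing here needs the Antlr runtime.

```java
import java.lang.reflect.Modifier;

final class ChannelLookupDemo
{
    // Stand-in for a generated lexer with an enum declared in @members.
    static class DemoLexer
    {
        public enum Channel { OUT, WHITESPACE, COMMENTS }
    }

    // Returns the constant names of the nested enum, or null when the nested
    // class is missing, is not an enum, or is not static.
    static String[] channelNames(Class<?> lexerClass, String enumName)
    {
        Class<?> channel;
        try
        {
            channel = Class.forName(lexerClass.getName() + "$" + enumName);
        }
        catch (ClassNotFoundException e)
        {
            return null;
        }
        if (!channel.isEnum() || !Modifier.isStatic(channel.getModifiers()))
        {
            return null;
        }
        Object[] constants = channel.getEnumConstants();
        String[] names = new String[constants.length];
        for (int i = 0; i < constants.length; ++i)
        {
            names[i] = constants[i].toString();
        }
        return names;
    }

    public static void main(String[] args)
    {
        System.out.println(String.join(", ", channelNames(DemoLexer.class, "Channel")));
    }
}
```

Since the harness indexes the names with token.getChannel(), the names only line up if the enum constants are declared in channel id order.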
Edit: Ah, forgot this tiny snippet.
Java:
package compile.antlr;
enum ValueType
{
INVALID, STRING, INTEGER, DOUBLE, BOOLEAN
}
Example of use using test harness
Code:
lexer grammar input;
#<if ((hello = 3) == 3 || var = hello)>
R: 'a';
#<end>
#<if (false)>
B: 'c';
#<end>
//#<if (expr)> line line line #<end>
//#<arguments> arg arg arg #<end>
//#<import(file, args...)>
//#<$var>
#<import ("grammar/input2.g4" "name1" "value2")>
Code:
#<arguments>
ruleName //first argument
ruleValue //second argument
#<end>
A#<$ruleName>: #<$ruleValue>;
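The argument hand-off between the two files above can be sketched in plain Java. This is a guess at the shape of environment, not the real class. The method names come from the comment block in the lexer. The first-in, first-out order is an assumption, based on environment.open(environment.popArg()) receiving the file name, which is the first thing pushed.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

final class EnvironmentSketch
{
    private final Map<String, String> symbols = new HashMap<>();
    // LinkedList rather than ArrayDeque so that a null (undefined symbol) can be queued.
    private final LinkedList<String> args = new LinkedList<>();

    void define(String symbol, String value) { symbols.put(symbol, value); }
    void undefine(String symbol) { symbols.remove(symbol); }
    String get(String symbol) { return symbols.get(symbol); }

    void pushArg(String arg) { args.addLast(arg); }
    String popArg() { return args.pollFirst(); } // null when no arguments are left
    void clearArgs() { args.clear(); }

    public static void main(String[] ignored)
    {
        EnvironmentSketch env = new EnvironmentSketch();

        // #<import ("grammar/input2.g4" "name1" "value2")>
        env.pushArg("grammar/input2.g4");
        env.pushArg("name1");
        env.pushArg("value2");
        String file = env.popArg(); // the ')' of the import hands this to open()

        // #<arguments> ruleName ruleValue #<end>
        env.define("ruleName", env.popArg());
        env.define("ruleValue", env.popArg());
        env.clearArgs(); // #<end> drops anything that was not consumed

        // A#<$ruleName>: #<$ruleValue>;
        System.out.println(file + " -> A" + env.get("ruleName") + ": " + env.get("ruleValue") + ";");
    }
}
```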
Output with tokens (I still have a few debug messages in there)
Code:
[Channel is not a declared member enum of @members of Antlr
Using channel id for -tokens instead of channel names
grammar/input.g4 {
Value: true == (hello=3)==3||var=hello
Value: false == false
Tokens {
Type Value Chan
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SCRIPT |lexer grammar input;\r\n\r\n| 0
SCRIPT |\r\nR: 'a';\r\n| 0
SCRIPT |\r\n| 0
SCRIPT |\r\n\r\n//#<if (expr)> line line line #<end>\r\n//#<arguments> arg arg arg #<end>\r\n//#<import(file, args...)>\r\n//#<$var>\r\n\r\n| 0
SCRIPT |\r\n\r\nA| 0
SCRIPT |name1| 0
SCRIPT |: | 0
SCRIPT |value2| 0
SCRIPT |;\r\n\r\n//arguments\r\n/*\r\n#<arguments>\r\n\tversion\t\t//first argument\r\n\tcount\t\t//second argument\r\n\t123what\t\t//third argument\r\n#<end>\r\n\r\n#<if (count > "0")>\r\n\t//Test#<$version>:\t\t#<$123what>;\r\n\t#<import("grammar\\input3.g4", count)>\r\n\t#<if (count > "7")>\r\n\t\t//Test#<$version>:\t\t#<$123what>;\r\n\t\t#<import("grammar\\input3.g4", 123what)>\r\n\t#<end>\r\n\t#<import("grammar\\input3.g4", count)>\r\n#<end>\r\n#<if (count > "1")>\r\n\tTest#<$version>:\t\t#<$123what>\r\n\tTest#<$version>:\t\t#<$123what>\r\n\tTest#<$version>:\t\t#<$123what>\r\n\tTest#<$version>:\t\t#<$123what>\r\n\tTest#<$version>:\t\t#<$123what>\r\n\tTest#<$version>:\t\t#<$123what>\r\n\t#<import("grammar\\input3.g4", 123what)>\r\n\t#<import("grammar\\input3.g4", 123what)>\r\n#<end>\r\n\r\n#<import("grammar\\input3.g4", 123what)>\r\n*/| 0
SCRIPT |\r\n\r\n/*\r\n#<arguments>\r\n\thello\t\t//first argument\r\n\tboo\t\t\t//second argument\r\n\twhat\t\t//third argument\r\n#<end>\r\n\r\nR: 'a';\r\n\r\n#<if ("4" == "4")>\r\n\tR[[a]];\r\n\t#<if ("4" == "4")>\r\n\t\tR[[c]];\r\n\t#<end>\r\n\t#<if ("4" == "4")>\r\n\t\tR[[b]];\r\n\t#<end>\r\n#<end>\r\n\r\n#<if ("4" == "4")>\r\n\tR[[a]];\r\n\t#<if ("4" == "4")>\r\n\t\tR[[c]];\r\n\t#<end>\r\n\t#<if ("4" == "4")>\r\n\t\tR[[b]];\r\n\t#<end>\r\n#<end>\r\n\r\n#<import("grammar\\input2.g4", "v5", "3", "what?")>\r\n\r\n#<if ("4" == "4")>\r\n\tR[[a]];\r\n#<end>\r\n\r\nR: 'b';\r\n*/\r\n\r\n\r\n\r\n| 0
EOF |<EOF>| 0
}
}
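In the output above, B: 'c'; never shows up because it sits inside #<if (false)>. Here is a minimal sketch of how an enabled stack like isEnabled/pushEnabled/popEnabled could behave. EnabledStack is a made-up name. The rule that a nested block is only enabled when its parent is also enabled is an assumption, not something taken from the real implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class EnabledStack
{
    private final Deque<Boolean> stack = new ArrayDeque<>();

    // Outside of any #<if> block everything is enabled.
    boolean isEnabled() { return stack.isEmpty() || stack.peek(); }

    // Entering a block: enabled only if the surrounding block is enabled too.
    void pushEnabled(boolean enable) { stack.push(isEnabled() && enable); }

    // Leaving a block at #<end>.
    void popEnabled() { if (!stack.isEmpty()) stack.pop(); }

    public static void main(String[] args)
    {
        EnabledStack enabled = new EnabledStack();
        enabled.pushEnabled(false);              // #<if (false)>
        enabled.pushEnabled(true);               // nested #<if (true)>
        System.out.println(enabled.isEnabled()); // false: the parent is disabled
        enabled.popEnabled();                    // #<end>
        enabled.popEnabled();                    // #<end>
        System.out.println(enabled.isEnabled()); // true: back at top level
    }
}
```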