self.visitChildren(ctx) in ANTLR visitor root node returns None in Python
As the title says: when propagating values up the parse tree, the root node gets None when I call self.visitChildren(ctx) there. I can see that the other nodes propagate their values upward; the root node is the only one that receives None.
I am using ANTLR 4.10.1 and the Python runtime antlr4-python3-runtime 4.10.
It prints the following when I enter 393939393 followed by CTRL-D:
Number: 393939393
Atom: {'type': 'number', 'value': '393939393'}
SExpr: None
None
I tried this with the following slightly modified version of the S-expression grammar; my visitor script is shown after it:
/*
Port to Antlr4 by Tom Everett
*/
grammar sexpr;
sexpr
/* : item* EOF */
: item EOF
;
item
: atom
| list_
/* | LPAREN item DOT item RPAREN */
;
list_
: LPAREN item* RPAREN
;
atom
: string
| symbol
| number
/* | DOT */
;
string: STRING ;
symbol: SYMBOL ;
number: NUMBER ;
STRING
: '"' ('\\' . | ~ ('\\' | '"'))* '"'
;
WHITESPACE
: (' ' | '\n' | '\t' | '\r')+ -> skip
;
NUMBER
: ('+' | '-')? (DIGIT)+ ('.' (DIGIT)+)?
;
SYMBOL
: SYMBOL_START (SYMBOL_START | DIGIT)*
;
LPAREN
: '('
;
RPAREN
: ')'
;
DOT
: '.'
;
fragment SYMBOL_START
: ('a' .. 'z')
| ('A' .. 'Z')
| '+'
| '-'
| '*'
| '/'
| '.'
;
fragment DIGIT
: ('0' .. '9')
;
Python 3 ANTLR visitor code
#!/usr/bin/python3
import sys
from antlr4 import *
from sexprLexer import sexprLexer
from sexprParser import sexprParser
from sexprVisitor import sexprVisitor


class SExprVisitor(sexprVisitor):
    def visitSexpr(self, ctx):
        r = self.visitChildren(ctx)
        print("SExpr: %s" % r)
        return r

    def visitItem(self, ctx):
        r = self.visitChildren(ctx)
        return r

    def visitAtom(self, ctx):
        r = self.visitChildren(ctx)
        print("Atom: %s" % r)
        return r

    def visitString(self, ctx):
        print("String: %s" % ctx.getText())
        return {'type':'string', 'value':ctx.getText()}

    def visitNumber(self, ctx):
        print("Number: %s" % ctx.getText())
        return {'type':'number', 'value':ctx.getText()}

    def visitSymbol(self, ctx):
        print("Symbol: %s" % ctx.getText())
        return {'type':'symbol', 'value':ctx.getText()}


def visitor_main(argv):
    input_stream = StdinStream()
    lexer = sexprLexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = sexprParser(stream)
    tree = parser.sexpr()
    visitor = SExprVisitor()
    output = visitor.visit(tree)
    print(output)


def main(argv):
    visitor_main(argv)


if __name__ == '__main__':
    main(sys.argv)
Solution 1:[1]
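The root rule ends with an EOF token. visitChildren returns the result of the last child it visits (the default aggregateResult simply keeps the most recent result), and the last child of sexpr is the EOF terminal, whose default visitTerminal returns None. That None is what visitSexpr gets back. If you include this method in your visitor: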
    def visitTerminal(self, ctx):
        # The `EOF` will now return this instead of `None`
        return '???'
you'll see ??? being returned by visitSexpr.
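To make the mechanism concrete without needing the generated parser, here is a minimal sketch that imitates the default visitor behaviour. Node, MiniVisitor and NumberVisitor are hypothetical stand-ins written for this illustration, not the runtime's actual classes; only the method names defaultResult, aggregateResult, visitTerminal and visitChildren mirror the real visitor API.

```python
# Simplified stand-in for the visitor defaults in antlr4-python3-runtime.
# Hypothetical classes for illustration only; not the runtime's source.

class Node:
    """A tree node: rule nodes have children, terminals have none."""
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class MiniVisitor:
    def defaultResult(self):
        return None

    def aggregateResult(self, aggregate, nextResult):
        # Default policy: the newest child result replaces the previous one.
        return nextResult

    def visitTerminal(self, node):
        return self.defaultResult()

    def visit(self, node):
        if not node.children:
            return self.visitTerminal(node)
        # Dispatch to visit<Rule> if defined, else fall back to visitChildren.
        method = getattr(self, "visit" + node.name.capitalize(), self.visitChildren)
        return method(node)

    def visitChildren(self, node):
        result = self.defaultResult()
        for child in node.children:
            result = self.aggregateResult(result, self.visit(child))
        return result


class NumberVisitor(MiniVisitor):
    def visitNumber(self, node):
        return {"type": "number", "value": node.children[0].name}


# sexpr -> item -> atom -> number -> NUMBER token, followed by EOF
number = Node("number", [Node("393939393")])
item = Node("item", [Node("atom", [number])])
tree = Node("sexpr", [item, Node("EOF")])

visitor = NumberVisitor()
print(visitor.visit(item))  # the item subtree yields the number dict
print(visitor.visit(tree))  # the trailing EOF result (None) wins at the root
```

The item subtree returns the dictionary, but at the root the EOF terminal is visited last, so its None replaces the dictionary.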
To fix it, just invoke the item in your visitSexpr:
    def visitSexpr(self, ctx):
        r = self.visitChildren(ctx.item())
        print("SExpr: %s" % r)
        return r
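If you would rather keep calling visitChildren(ctx) on the root, another option is to override aggregateResult, the hook the visitor uses to combine child results. The sketch below assumes you want the last non-None child result. KeepLastNonNone and fold_child_results are hypothetical names, and the fold helper only imitates the loop inside visitChildren so the example runs without the generated parser.

```python
class KeepLastNonNone:
    """Mixin (hypothetical name) to list before the generated visitor base class."""

    def aggregateResult(self, aggregate, nextResult):
        # Ignore children that produced nothing, such as the EOF terminal.
        return aggregate if nextResult is None else nextResult


def fold_child_results(visitor, child_results):
    """Imitates visitChildren: fold child results through aggregateResult."""
    result = None
    for child_result in child_results:
        result = visitor.aggregateResult(result, child_result)
    return result


item_result = {"type": "number", "value": "393939393"}
# Results of the root's children: the item first, then the EOF terminal.
print(fold_child_results(KeepLastNonNone(), [item_result, None]))
```

In the real visitor this would presumably be used as `class SExprVisitor(KeepLastNonNone, sexprVisitor)`. Note that it still keeps only one child result, so a rule with several meaningful children (such as list_) needs its own visit method to collect them.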
And you can make your grammar a bit more compact by using [...] instead of the old v3 syntax '?' .. '?':
grammar sexpr;
sexpr
/* : item* EOF */
: item EOF
;
item
: atom
| list_
/* | LPAREN item DOT item RPAREN */
;
list_
: LPAREN item* RPAREN
;
atom
: string
| symbol
| number
/* | DOT */
;
string: STRING ;
symbol: SYMBOL ;
number: NUMBER ;
STRING
: '"' ('\\' . | ~[\\"])* '"'
;
WHITESPACE
: [ \t\r\n]+ -> skip
;
NUMBER
:[+\-]? (DIGIT)+ ('.' (DIGIT)+)?
;
SYMBOL
: SYMBOL_START (SYMBOL_START | DIGIT)*
;
LPAREN
: '('
;
RPAREN
: ')'
;
DOT
: '.'
;
fragment SYMBOL_START
: [a-zA-Z+\-*/.]
;
fragment DIGIT
: [0-9]
;
EDIT
"I didn't understand your comment about the compact grammar in [...]"
I mean that the older v3 syntax 'a' .. 'z' can be written in v4 as [a-z], making it more compact.
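The same character-set notation exists in Python's re module, which makes it easy to experiment with what the compact rules accept. The patterns below are assumed regex approximations of the NUMBER and SYMBOL rules, written for this illustration; they are only an analogy, because an ANTLR lexer also applies longest-match and rule-order tie-breaking, which fullmatch does not model.

```python
import re

# Regex approximations of the lexer rules (assumed equivalents, for illustration).
DIGIT = r"[0-9]"
SYMBOL_START = r"[a-zA-Z+\-*/.]"
NUMBER = re.compile(rf"[+\-]?{DIGIT}+(\.{DIGIT}+)?")
SYMBOL = re.compile(rf"{SYMBOL_START}({SYMBOL_START}|{DIGIT})*")

for text in ["42", "-3.5", "/7", "foo1", "1abc"]:
    kinds = [name for name, rx in (("NUMBER", NUMBER), ("SYMBOL", SYMBOL))
             if rx.fullmatch(text)]
    print(text, kinds)
```

An input such as -3.5 fits both patterns. In the grammar itself the match lengths are equal, so the rule defined first (NUMBER) is the one the lexer picks.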
"If I use item* instead of item what is the best way to traverse the items? ..."
What you added in your comment will probably work. You could change the grammar slightly and create an items : item*; rule that is used by the other rules. You could also use alternative labels, so that you don't need extra rules like string, symbol and number.
A quick demo:
/*
Port to Antlr4 by Tom Everett
*/
grammar sexpr;
sexpr
: items EOF
;
items
: item*
;
item
: atom #item_atom
| list_ #item_list
;
list_
: LPAREN items RPAREN
;
atom
: STRING #atom_string
| SYMBOL #atom_symbol
| NUMBER #atom_number
;
STRING
: '"' ('\\' . | ~[\\"])* '"'
;
WHITESPACE
: [ \t\r\n]+ -> skip
;
NUMBER
:[+\-]? (DIGIT)+ ('.' (DIGIT)+)?
;
SYMBOL
: SYMBOL_START (SYMBOL_START | DIGIT)*
;
LPAREN
: '('
;
RPAREN
: ')'
;
DOT
: '.'
;
fragment SYMBOL_START
: [a-zA-Z+\-*/.]
;
fragment DIGIT
: [0-9]
;
and if you now run:
import sys
from antlr4 import *
from sexprLexer import sexprLexer
from sexprParser import sexprParser
from sexprVisitor import sexprVisitor


class SExprVisitor(sexprVisitor):
    def visitSexpr(self, ctx):
        return self.visit(ctx.items())

    def visitItems(self, ctx):
        items = []
        for item in ctx.item():
            items.append(self.visit(item))
        return items

    def visitItem_atom(self, ctx):
        return self.visit(ctx.atom())

    def visitItem_list(self, ctx):
        return self.visit(ctx.list_())

    def visitList_(self, ctx):
        return self.visit(ctx.items())

    def visitAtom_string(self, ctx):
        return {'type': 'string', 'value': ctx.getText()}

    def visitAtom_number(self, ctx):
        return {'type': 'number', 'value': ctx.getText()}

    def visitAtom_symbol(self, ctx):
        return {'type': 'symbol', 'value': ctx.getText()}


def visitor_main(argv):
    lexer = sexprLexer(InputStream('("Q" 42 /7)'))
    stream = CommonTokenStream(lexer)
    parser = sexprParser(stream)
    tree = parser.sexpr()
    visitor = SExprVisitor()
    output = visitor.visit(tree)
    print(output)


def main(argv):
    visitor_main(argv)


if __name__ == '__main__':
    main(sys.argv)
the following is printed:
[[{'type': 'string', 'value': '"Q"'}, {'type': 'number', 'value': '42'}, {'type': 'symbol', 'value': '/7'}]]
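The outer list comes from visitItems at the root, and the inner one from the parenthesised list. As a rough sketch of how this structure might be consumed, the helper below walks it recursively and converts each atom into a plain Python value. to_python is a hypothetical name, and the conversion rules (int or float for numbers, quotes stripped from strings, symbols kept as text) are assumptions made for the example.

```python
def to_python(node):
    """Hypothetical helper: turn the visitor output into plain Python values."""
    if isinstance(node, list):
        return [to_python(child) for child in node]
    kind, text = node["type"], node["value"]
    if kind == "number":
        return float(text) if "." in text else int(text)
    if kind == "string":
        return text[1:-1]  # drop the surrounding quotes; escapes are left as-is
    return text  # symbols stay as their source text


output = [[{'type': 'string', 'value': '"Q"'},
           {'type': 'number', 'value': '42'},
           {'type': 'symbol', 'value': '/7'}]]
print(to_python(output))
```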
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
