antlr4 python 3 print or dump tokens from plsql grammar
Question
I am using antlr4 in Python to read the following grammar:
https://github.com/antlr/grammars-v4/tree/master/plsql
The file grants.sql contains just "begin select 'bob' from dual; end;"
Here is simple code that prints a Lisp-like tree:
from antlr4 import *
from PlSqlLexer import PlSqlLexer
from PlSqlParser import PlSqlParser
from PlSqlParserListener import PlSqlParserListener
input = FileStream('grants.sql')
lexer = PlSqlLexer(input)
stream = CommonTokenStream(lexer)
parser = PlSqlParser(stream)
tree = parser.sql_script()
print("Tree " + tree.toStringTree(recog=parser))
The output is:
Tree (sql_script (unit_statement (anonymous_block BEGIN (seq_of_statements (statement (sql_statement (data_manipulation_language_statements (select_statement (subquery (subquery_basic_elements (query_block SELECT (selected_element (select_list_elements (expressions (expression (logical_expression (multiset_expression (relational_expression (compound_expression (concatenation (model_expression (unary_expression (atom (constant (quoted_string 'bob')))))))))))))) (from_clause FROM (table_ref_list (table_ref (table_ref_aux (table_ref_aux_internal (dml_table_expression_clause (tableview_name (identifier (id_expression (regular_id DUAL))))))))))))))))) ;) END ;)) )
I'd like Python code that lists the above not as a Lisp-like expression but as a list of all the rules and tokens, one per line, i.e.
- .sql_script
- .. unit_statement
- ... anonymous_block
- .... BEGIN
etc.
Can someone supply Python code that does this, or give me some hints? Much appreciated.
Recommended answer
Here's a start:
from antlr4 import *
from antlr4.tree.Tree import TerminalNodeImpl
from PlSqlLexer import PlSqlLexer
from PlSqlParser import PlSqlParser

# Generate the lexer and parser like this:
#
#   java -jar antlr-4.7.1-complete.jar -Dlanguage=Python3 *.g4
#

def main():
    lexer = PlSqlLexer(InputStream("SELECT * FROM TABLE_NAME"))
    parser = PlSqlParser(CommonTokenStream(lexer))
    tree = parser.sql_script()
    traverse(tree, parser.ruleNames)

def traverse(tree, rule_names, indent=0):
    # Skip the end-of-file marker.
    if tree.getText() == "<EOF>":
        return
    elif isinstance(tree, TerminalNodeImpl):
        # Leaf node: a token produced by the lexer.
        print("{0}TOKEN='{1}'".format("  " * indent, tree.getText()))
    else:
        # Inner node: print the parser rule name, then recurse into the children.
        print("{0}{1}".format("  " * indent, rule_names[tree.getRuleIndex()]))
        # children is None for a rule that matched nothing.
        for child in tree.children or []:
            traverse(child, rule_names, indent + 1)

if __name__ == '__main__':
    main()
which prints:
sql_script
  unit_statement
    data_manipulation_language_statements
      select_statement
        subquery
          subquery_basic_elements
            query_block
              TOKEN='SELECT'
              TOKEN='*'
              from_clause
                TOKEN='FROM'
                table_ref_list
                  table_ref
                    table_ref_aux
                      table_ref_aux_internal
                        dml_table_expression_clause
                          tableview_name
                            identifier
                              id_expression
                                regular_id
                                  TOKEN='TABLE_NAME'
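If you want the dotted prefix from the question (". sql_script", ".. unit_statement", and so on) rather than space indentation, the same traversal can be written as a generator that yields lines. The following is only a sketch: FakeRule and FakeToken are hypothetical stand-ins so that it runs without the generated parser, and detecting terminals via getSymbol() is an assumption about the runtime's node classes. With the real tree you can keep the isinstance(tree, TerminalNodeImpl) check from above.

```python
def is_terminal(node):
    # Assumption: terminal nodes expose getSymbol(), rule contexts do not.
    return hasattr(node, "getSymbol")


def dotted_lines(node, rule_names, depth=1):
    """Yield one line per rule or token, prefixed with `depth` dots."""
    prefix = "." * depth
    if is_terminal(node):
        text = node.getText()
        if text != "<EOF>":  # skip the end-of-file marker
            yield "{0} {1}".format(prefix, text)
        return
    yield "{0} {1}".format(prefix, rule_names[node.getRuleIndex()])
    # children may be None for a rule that matched nothing
    for child in node.children or []:
        yield from dotted_lines(child, rule_names, depth + 1)


class FakeToken:
    """Hypothetical stand-in for a terminal node (not part of antlr4)."""

    def __init__(self, text):
        self._text = text

    def getSymbol(self):
        return self

    def getText(self):
        return self._text


class FakeRule:
    """Hypothetical stand-in for a parser rule context (not part of antlr4)."""

    def __init__(self, rule_index, children=None):
        self._rule_index = rule_index
        self.children = children

    def getRuleIndex(self):
        return self._rule_index


if __name__ == "__main__":
    names = ["sql_script", "unit_statement", "anonymous_block"]
    block = FakeRule(2, [FakeToken("BEGIN"), FakeToken("END")])
    demo = FakeRule(0, [FakeRule(1, [block]), FakeToken("<EOF>")])
    for line in dotted_lines(demo, names):
        print(line)
```

With a real parse you would call dotted_lines(tree, parser.ruleNames) and print each line; returning lines instead of printing also makes it easy to write them to a file.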
Note that for the lexer and parser to work properly, I added the following Python classes:
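# PlSqlBaseLexer.py
from antlr4 import *

class PlSqlBaseLexer(Lexer):

    def IsNewlineAtPos(self, pos):
        # LA() returns an integer code point (-1 at end of input),
        # so compare against ord('\n') rather than the string '\n'.
        la = self._input.LA(pos)
        return la == -1 or la == ord('\n')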
and:
# PlSqlBaseParser.py
from antlr4 import *

class PlSqlBaseParser(Parser):

    _isVersion10 = False
    _isVersion12 = True

    def isVersion10(self):
        return self._isVersion10

    def isVersion12(self):
        return self._isVersion12

    def setVersion10(self, value):
        self._isVersion10 = value

    def setVersion12(self, value):
        self._isVersion12 = value
I placed both files in the same folder as the generated Python classes. I also needed to add the import statement "from PlSqlBaseLexer import PlSqlBaseLexer" to the generated PlSqlLexer.py, and to change the import statement in PlSqlParser.py from "from ./PlSqlBaseParser import PlSqlBaseParser" (which is not valid Python) to "from PlSqlBaseParser import PlSqlBaseParser".
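Because regenerating the parser overwrites those manual edits, it may be convenient to script them. The sketch below assumes the file names and the exact import text described in this answer; the helper name patch_generated is made up for illustration, and other versions of the grammar may generate different imports.

```python
from pathlib import Path

# Import text as described in the answer; adjust if your generated files differ.
BROKEN_IMPORT = "from ./PlSqlBaseParser import PlSqlBaseParser"
FIXED_IMPORT = "from PlSqlBaseParser import PlSqlBaseParser"
LEXER_IMPORT = "from PlSqlBaseLexer import PlSqlBaseLexer"


def patch_generated(folder):
    """Apply the two import fixes to the generated files in `folder`."""
    folder = Path(folder)

    # Replace the invalid relative import in the generated parser.
    parser_path = folder / "PlSqlParser.py"
    parser_src = parser_path.read_text(encoding="utf-8")
    parser_path.write_text(
        parser_src.replace(BROKEN_IMPORT, FIXED_IMPORT), encoding="utf-8"
    )

    # Add the base-lexer import at the top of the generated lexer, once.
    lexer_path = folder / "PlSqlLexer.py"
    lexer_src = lexer_path.read_text(encoding="utf-8")
    if LEXER_IMPORT not in lexer_src:
        lexer_path.write_text(LEXER_IMPORT + "\n" + lexer_src, encoding="utf-8")


if __name__ == "__main__":
    patch_generated(".")
```

Running it twice is harmless: the replacement finds nothing the second time, and the lexer import is only added when it is missing.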
Note that running the demo is rather slow. Unless you have a hard requirement to do this in Python, I recommend going with the (much!) faster Java or C# target instead.