Parsing SQL with Python
Question
I want to create a SQL interface on top of a non-relational data store. The store itself is non-relational, but it makes sense to access the data in a relational manner.
I am looking into using ANTLR to produce an AST that represents the SQL as a relational algebra expression, and then returning data by evaluating (walking) the tree.
I have never implemented a parser before, so I would like some advice on how best to implement a SQL parser and evaluator.
- Does the approach described above sound about right?
- Are there other tools or libraries I should look into, such as PLY or pyparsing?
- Pointers to articles, books, or source code that would help me are appreciated.
Update:
I implemented a simple SQL parser using pyparsing. Combined with Python code that implements the relational operations against my data store, this was fairly simple.
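The original code is not shown, so the following is only a minimal sketch of what "relational operations against a data store" could look like. It assumes the store can be exposed as a dict mapping table names to lists of row dicts. The node classes (`Scan`, `Select`, `Project`) and the `evaluate` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List, Tuple

Row = Dict[str, Any]


@dataclass
class Scan:
    """Read every row of a table (hypothetical node)."""
    table: str


@dataclass
class Select:
    """Relational selection: keep rows that satisfy a predicate."""
    source: Any
    predicate: Callable[[Row], bool]


@dataclass
class Project:
    """Relational projection: keep only the named columns."""
    source: Any
    columns: Tuple[str, ...]


def evaluate(node: Any, store: Dict[str, List[Row]]) -> Iterator[Row]:
    """Walk the tree recursively and yield result rows lazily."""
    if isinstance(node, Scan):
        yield from store[node.table]
    elif isinstance(node, Select):
        for row in evaluate(node.source, store):
            if node.predicate(row):
                yield row
    elif isinstance(node, Project):
        for row in evaluate(node.source, store):
            yield {name: row[name] for name in node.columns}
    else:
        raise TypeError(f"unknown node type: {type(node).__name__}")


# Assumed stand-in for the real data store.
store = {
    "users": [
        {"id": 1, "name": "ann", "age": 31},
        {"id": 2, "name": "bob", "age": 17},
    ]
}

# Roughly: SELECT name FROM users WHERE age >= 18
plan = Project(Select(Scan("users"), lambda row: row["age"] >= 18), ("name",))
adults = list(evaluate(plan, store))
```

A parser would build the `plan` tree from the SQL text. Keeping the tree separate from the evaluator means the evaluator can be tested without any parsing at all.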
As I said in one of the comments, the point of the exercise was to make the data available to reporting engines. To do this, I will probably need to implement an ODBC driver, which is probably a lot of work.
Recommended answer
I have looked into this issue quite extensively. python-sqlparse is a non-validating parser, which is not really what you need. The examples in ANTLR need a lot of work to convert into a nice AST in Python. The SQL standard grammars are here, but converting them yourself would be a full-time job, and you would likely need only a subset of them (for example, no joins). You could also look at gadfly (a Python SQL database), but I avoided it because it uses its own parsing tool.
For my case, I essentially needed only a WHERE clause. I tried booleano (a boolean expression parser written with pyparsing) but ended up using pyparsing from scratch. The first link in Mark Rushakoff's reddit post gives a SQL example using it. Whoosh, a full-text search engine, also uses it, but I have not looked at the source to see how.
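To illustrate the "just a WHERE clause" case, here is a minimal sketch using pyparsing's `infixNotation` helper. It is not the code from this answer. The grammar (column, comparison operator, integer or single-quoted string, combined with NOT/AND/OR) and the class names are assumptions made for the example. It uses pyparsing's long-standing camelCase names, which newer releases also expose in snake_case.

```python
import operator

import pyparsing as pp

# Comparison operators supported by this hypothetical mini-grammar.
COMPARATORS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


class Comparison:
    """Leaf node: <column> <op> <literal>."""
    def __init__(self, tokens):
        self.column, self.op, self.value = tokens

    def eval(self, row):
        return COMPARATORS[self.op](row[self.column], self.value)


class Not:
    def __init__(self, tokens):
        self.operand = tokens[0][1]  # tokens[0] is ['not', operand]

    def eval(self, row):
        return not self.operand.eval(row)


class And:
    def __init__(self, tokens):
        self.operands = tokens[0][0::2]  # skip the 'and' keywords

    def eval(self, row):
        return all(node.eval(row) for node in self.operands)


class Or:
    def __init__(self, tokens):
        self.operands = tokens[0][0::2]  # skip the 'or' keywords

    def eval(self, row):
        return any(node.eval(row) for node in self.operands)


column = pp.Word(pp.alphas, pp.alphanums + "_")
literal = pp.pyparsing_common.integer | pp.QuotedString("'")
comparator = pp.oneOf("= != <= >= < >")
condition = (column + comparator + literal).setParseAction(Comparison)

# Operators are listed from highest to lowest precedence.
where_clause = pp.infixNotation(
    condition,
    [
        (pp.CaselessKeyword("not"), 1, pp.opAssoc.RIGHT, Not),
        (pp.CaselessKeyword("and"), 2, pp.opAssoc.LEFT, And),
        (pp.CaselessKeyword("or"), 2, pp.opAssoc.LEFT, Or),
    ],
)


def compile_where(text):
    """Parse a WHERE expression and return an object with .eval(row)."""
    return where_clause.parseString(text, parseAll=True)[0]


rows = [
    {"name": "ann", "age": 31},
    {"name": "bob", "age": 17},
    {"name": "cy", "age": 45},
]
predicate = compile_where("age >= 18 AND NOT name = 'cy'")
matching = [row["name"] for row in rows if predicate.eval(row)]
```

Attaching a class to each grammar rule as a parse action means the parse result is already an evaluable tree, so no separate tree-walking pass is needed. Malformed input makes `parseString` raise a pyparsing parse exception.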
Pyparsing is very easy to use, and you can easily customize the grammar so that it is not exactly the same as SQL (most of the syntax you will not need). I did not like PLY because it relies on some magic based on naming conventions.
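As a rough illustration of supporting only a subset of SQL, the sketch below accepts nothing but `SELECT <columns> FROM <table>`. The grammar, the `RESERVED` set, and the `parse_select` helper are assumptions made for this example, not part of any library.

```python
import pyparsing as pp

SELECT, FROM = map(pp.CaselessKeyword, ("select", "from"))
RESERVED = {"select", "from"}

# A name is a plain word that is not one of the reserved keywords.
identifier = pp.Word(pp.alphas, pp.alphanums + "_").addCondition(
    lambda tokens: tokens[0].lower() not in RESERVED
)
column_list = pp.Group(pp.delimitedList(identifier))

# Results names let the caller pick out parts of the match by name.
query = SELECT + column_list("columns") + FROM + identifier("table")


def parse_select(sql):
    """Return (table, columns), or raise ValueError with the error position."""
    try:
        result = query.parseString(sql, parseAll=True)
    except pp.ParseException as exc:
        raise ValueError(f"bad query at column {exc.col}: {exc.msg}") from exc
    return result["table"], list(result["columns"])


table, columns = parse_select("SELECT name, age FROM users")
```

Anything outside this subset (joins, subqueries, `SELECT *`) is simply rejected with an error. The grammar can then be extended one construct at a time as the need arises.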
In short, give pyparsing a try. It will most likely be powerful enough to do what you need, and its simple integration with Python (with easy callbacks and error handling) will make the experience fairly painless.