How Pony (ORM) does its tricks?


Question

Pony ORM does the nice trick of converting a generator expression into SQL. Example:

>>> select(p for p in Person if p.name.startswith('Paul'))
        .order_by(Person.name)[:2]

SELECT "p"."id", "p"."name", "p"."age"
FROM "Person" "p"
WHERE "p"."name" LIKE "Paul%"
ORDER BY "p"."name"
LIMIT 2

[Person[3], Person[1]]
>>>

I know Python has wonderful introspection and metaprogramming built in, but how is this library able to translate the generator expression without preprocessing? It looks like magic.

[update]

Blender wrote:

Here is the file that you're after. It seems to reconstruct the generator using some introspection wizardry. I'm not sure if it supports 100% of Python's syntax, but this is pretty cool. – Blender

I was thinking they were exploiting some feature of the generator expression protocol, but looking at this file, and seeing the ast module involved... No, they are not inspecting the program source on the fly, are they? Mind-blowing...

@BrenBarn: If I try to call the generator outside the select function call, the result is:

>>> x = (p for p in Person if p.age > 20)
>>> x.next()
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "<interactive input>", line 1, in <genexpr>
  File "C:\Python27\lib\site-packages\pony\orm\core.py", line 1822, in next
    % self.entity.__name__)
  File "C:\Python27\lib\site-packages\pony\utils.py", line 92, in throw
    raise exc
TypeError: Use select(...) function or Person.select(...) method for iteration
>>>

It seems they are doing more arcane incantations, like inspecting the select function call and processing the Python abstract syntax tree on the fly.

I still would like to see someone explaining it, the source is way beyond my wizardry level.

Answer

Pony ORM author is here.

Pony translates a Python generator into a SQL query in three steps:

  1. Decompiling the generator bytecode and rebuilding the generator AST (abstract syntax tree)
  2. Translating the Python AST into "abstract SQL", a universal list-based representation of a SQL query
  3. Converting the abstract SQL representation into a specific database-dependent SQL dialect
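To make the idea of a list-based query representation more concrete, here is a minimal sketch of what the third step might look like. The nested-list layout, the node names (SELECT, COLUMN, EQ, VALUE) and the render() and quote() helpers are all invented for illustration; they are not Pony's actual internal format or API.

```python
def quote(identifier):
    """Quote an identifier the way many SQL dialects do (hypothetical helper)."""
    return '"' + identifier.replace('"', '""') + '"'


def render(node):
    """Render a toy nested-list query description into a SQL string."""
    kind = node[0]
    if kind == "SELECT":
        _, columns, table, where = node
        sql = "SELECT " + ", ".join(render(col) for col in columns)
        sql += " FROM " + quote(table[0]) + " " + quote(table[1])
        if where is not None:
            sql += " WHERE " + render(where)
        return sql
    if kind == "COLUMN":
        return quote(node[1]) + "." + quote(node[2])
    if kind == "EQ":
        return render(node[1]) + " = " + render(node[2])
    if kind == "VALUE":
        # Escape single quotes inside string literals.
        return "'" + node[1].replace("'", "''") + "'"
    raise NotImplementedError(kind)


# A made-up "abstract SQL" description of a simple query.
query = ["SELECT",
         [["COLUMN", "p", "id"], ["COLUMN", "p", "name"]],
         ["Person", "p"],
         ["EQ", ["COLUMN", "p", "name"], ["VALUE", "Paul"]]]

sql = render(query)
print(sql)
# SELECT "p"."id", "p"."name" FROM "Person" "p" WHERE "p"."name" = 'Paul'
```

A dialect-specific renderer could override only the pieces that differ between databases (identifier quoting, LIMIT syntax and so on), which is one plausible reason to keep an intermediate representation between the Python AST and the final SQL text.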

The most complex part is the second step, where Pony must understand the "meaning" of Python expressions. It seems you are most interested in the first step, so let me explain how decompiling works.

Let's consider this query:

>>> from pony.orm.examples.estore import *
>>> select(c for c in Customer if c.country == 'USA').show()

It will be translated into the following SQL:

SELECT "c"."id", "c"."email", "c"."password", "c"."name", "c"."country", "c"."address"
FROM "Customer" "c"
WHERE "c"."country" = 'USA'

Below is the result of this query, as printed by show():

id|email              |password|name          |country|address  
--+-------------------+--------+--------------+-------+---------
1 |john@example.com   |***     |John Smith    |USA    |address 1
2 |matthew@example.com|***     |Matthew Reed  |USA    |address 2
4 |rebecca@example.com|***     |Rebecca Lawson|USA    |address 4

The select() function accepts a Python generator as an argument and then analyzes its bytecode. We can get the bytecode instructions of this generator using the standard Python dis module:

>>> gen = (c for c in Customer if c.country == 'USA')
>>> import dis
>>> dis.dis(gen.gi_frame.f_code)
  1           0 LOAD_FAST                0 (.0)
        >>    3 FOR_ITER                26 (to 32)
              6 STORE_FAST               1 (c)
              9 LOAD_FAST                1 (c)
             12 LOAD_ATTR                0 (country)
             15 LOAD_CONST               0 ('USA')
             18 COMPARE_OP               2 (==)
             21 POP_JUMP_IF_FALSE        3
             24 LOAD_FAST                1 (c)
             27 YIELD_VALUE         
             28 POP_TOP             
             29 JUMP_ABSOLUTE            3
        >>   32 LOAD_CONST               1 (None)
             35 RETURN_VALUE
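The listing above comes from Python 2.7. As a rough sketch of the same inspection on Python 3, the snippet below reads the opcode names of a generator expression without iterating over it. It assumes a plain namedtuple called Customer as a stand-in for the Pony entity, and it uses gen.gi_code and dis.get_instructions from the standard library. The exact opcodes differ between Python versions, so no full listing is shown.

```python
import dis
from collections import namedtuple

# Stand-in for a Pony entity; a plain namedtuple is enough to build a generator.
Customer = namedtuple("Customer", "name country")
customers = [Customer("John Smith", "USA"), Customer("Anna Schmidt", "Germany")]

gen = (c for c in customers if c.country == "USA")

# gi_code is the code object behind the generator; reading it does not run it.
instructions = list(dis.get_instructions(gen.gi_code))
opnames = [ins.opname for ins in instructions]

print(opnames)
```

Because the generator is never advanced, no Customer object is actually tested against the condition. This is the property that lets a library look at the query before deciding how to execute it.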

Pony ORM has the function decompile() within module pony.orm.decompiling which can restore an AST from the bytecode:

>>> from pony.orm.decompiling import decompile
>>> ast, external_names = decompile(gen)

Here, we can see the textual representation of the AST nodes:

>>> ast
GenExpr(GenExprInner(Name('c'), [GenExprFor(AssName('c', 'OP_ASSIGN'), Name('.0'),
[GenExprIf(Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]))])]))

Let's now see how the decompile() function works.

The decompile() function creates a Decompiler object, which implements the Visitor pattern. The decompiler instance receives bytecode instructions one by one. For each instruction, the decompiler object calls one of its own methods, whose name is the same as the name of the current bytecode instruction.

When Python evaluates an expression, it uses a stack that stores the intermediate results of the calculation. The decompiler object also has its own stack, but this stack stores AST nodes for the expression rather than the results of evaluating it.

When the decompiler method for the next bytecode instruction is called, it takes AST nodes from the stack, combines them into a new AST node, and then puts this node on top of the stack.

For example, let's see how the subexpression c.country == 'USA' is calculated. The corresponding bytecode fragment is:

              9 LOAD_FAST                1 (c)
             12 LOAD_ATTR                0 (country)
             15 LOAD_CONST               0 ('USA')
             18 COMPARE_OP               2 (==)

So, the decompiler object does the following:

  1. Calls decompiler.LOAD_FAST('c'). This method puts the Name('c') node on the top of the decompiler stack.
  2. Calls decompiler.LOAD_ATTR('country'). This method takes the Name('c') node from the stack, creates the Getattr(Name('c'), 'country') node and puts it on the top of the stack.
  3. Calls decompiler.LOAD_CONST('USA'). This method puts the Const('USA') node on top of the stack.
  4. Calls decompiler.COMPARE_OP('=='). This method takes two nodes (Getattr and Const) from the stack, and then puts Compare(Getattr(Name('c'), 'country'), [('==', Const('USA'))]) on the top of the stack.

After all bytecode instructions are processed, the decompiler stack contains a single AST node which corresponds to the whole generator expression.
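The stack-based walk described above can be sketched in a few lines. This is a toy illustration, not Pony's Decompiler class: the ToyDecompiler name, the tuple-based node format and the hand-written instruction list are assumptions made to keep the example independent of any particular Python version's bytecode.

```python
class ToyDecompiler:
    """Rebuilds an expression tree from (opname, argument) pairs."""

    def __init__(self):
        self.stack = []

    def decompile(self, instructions):
        for opname, arg in instructions:
            # Dispatch by name: the handler is the method named like the opcode.
            handler = getattr(self, opname, None)
            if handler is None:
                raise NotImplementedError(opname)
            self.stack.append(handler(arg))
        if len(self.stack) != 1:
            raise ValueError("expected exactly one node on the stack")
        return self.stack.pop()

    def LOAD_FAST(self, name):
        return ("Name", name)

    def LOAD_ATTR(self, attr):
        # Consumes the object node that is currently on top of the stack.
        return ("Getattr", self.stack.pop(), attr)

    def LOAD_CONST(self, value):
        return ("Const", value)

    def COMPARE_OP(self, op):
        # Operands were pushed left to right, so the right one is popped first.
        right = self.stack.pop()
        left = self.stack.pop()
        return ("Compare", left, [(op, right)])


# The bytecode fragment for c.country == 'USA', written out by hand.
fragment = [
    ("LOAD_FAST", "c"),
    ("LOAD_ATTR", "country"),
    ("LOAD_CONST", "USA"),
    ("COMPARE_OP", "=="),
]

node = ToyDecompiler().decompile(fragment)
print(node)
# ('Compare', ('Getattr', ('Name', 'c'), 'country'), [('==', ('Const', 'USA'))])
```

Each handler pops the operands its opcode would consume at run time and returns a node describing the operation instead of its value, which mirrors the four steps listed above.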

Since Pony ORM needs to decompile generators and lambdas only, this is not that complex, because the instruction flow for a generator is relatively straightforward - it is just a bunch of nested loops.

Currently Pony ORM covers the whole generator instruction set except two things:

  1. Inline if expressions: a if b else c
  2. Compound comparisons: a < b < c

If Pony encounters such an expression, it raises a NotImplementedError exception. But even in this case you can make it work by passing the generator expression as a string. When you pass a generator as a string, Pony doesn't use the decompiler module. Instead, it gets the AST using the standard Python compiler.parse function.
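The compiler module mentioned here exists only in Python 2. As a hedged illustration of the same idea on Python 3, the sketch below uses the standard ast module as a stand-in for compiler.parse. It only shows that parsing source text directly yields nodes for the two constructs listed above; it is not how Pony itself is implemented.

```python
import ast

# Parsing source text gives an AST directly, with no bytecode involved.
tree = ast.parse("(a if b else c for x in xs)", mode="eval")
genexpr = tree.body

# A compound comparison is a single Compare node holding two operators.
chained = ast.parse("a < b < c", mode="eval").body

print(type(genexpr).__name__, type(genexpr.elt).__name__)
# GeneratorExp IfExp
print(type(chained).__name__, len(chained.ops))
# Compare 2
```

Since the parser works from the text, constructs that are awkward to recover from bytecode (such as the jumps produced by a conditional expression) appear as ordinary nodes.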

Hope this answers your question.
