Confusion between prepared statement and parameterized query in Python


Question

As far as I understand, prepared statements are (mainly) a database feature that allows you to separate parameters from the code that uses such parameters. Example:

PREPARE fooplan (int, text, bool, numeric) AS
    INSERT INTO foo VALUES($1, $2, $3, $4);
EXECUTE fooplan(1, 'Hunter Valley', 't', 200.00);

A parameterized query replaces manual string interpolation, so instead of doing

cursor.execute("SELECT * FROM tablename WHERE fieldname = %s" % value)

we can do

cursor.execute("SELECT * FROM tablename WHERE fieldname = %s", [value])

Now, it seems that prepared statements are, for the most part, used in the database language and parameterized queries are mainly used in the programming language connecting to the database, although I have seen exceptions to this rule.

The problem is that asking about the difference between prepared statement and parameterized query brings a lot of confusion. Their purpose is admittedly the same, but their methodology seems distinct. Yet, there are sources indicating that both are the same. MySQLdb and Psycopg2 seem to support parameterized queries but don’t support prepared statements (e.g. here for MySQLdb and in the TODO list for postgres drivers or this answer in the sqlalchemy group). Actually, there is a gist implementing a psycopg2 cursor supporting prepared statements and a minimal explanation about it. There is also a suggestion of subclassing the cursor object in psycopg2 to provide the prepared statement manually.

I would like to get an authoritative answer to the following questions:

  • Is there a meaningful difference between prepared statement and parameterized query? Does this matter in practice? If you use parameterized queries, do you need to worry about prepared statements?

  • If there is a difference, what is the current status of prepared statements in the Python ecosystem? Which database adapters support prepared statements?

Answer

  • Prepared statement: A reference to a pre-interpreted query routine on the database, ready to accept parameters

  • Parametrized query: A query made by your code in such a way that you are passing values in alongside some SQL that has placeholder values, usually ? or %s or something of that flavor.
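As a rough illustration of the second definition, here is a minimal sketch using the standard-library sqlite3 module, which uses ? (and :name) placeholders rather than %s. The table foo and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, name TEXT)")  # hypothetical table

# Positional "qmark" placeholders: the values travel separately from the SQL text.
conn.execute("INSERT INTO foo VALUES (?, ?)", (1, "Hunter Valley"))

# Named placeholders take a mapping instead of a sequence.
row = conn.execute(
    "SELECT id, name FROM foo WHERE name = :name", {"name": "Hunter Valley"}
).fetchone()
print(row)
```

Other drivers use a different placeholder style (for example %s in MySQLdb and psycopg2), but the call shape is the same: SQL text first, values second.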

The confusion here seems to stem from the (apparent) lack of distinction between the ability to directly get a prepared statement object and the ability to pass values into a 'parametrized query' method that acts very much like one... because it is one, or at least makes one for you.

For example: the C interface of the SQLite3 library has a lot of tools for working with prepared statement objects, but the Python API makes almost no mention of them. You can't prepare a statement yourself and reuse it whenever you want. Instead, you can use cursor.executemany(sql, params) (a method on sqlite3 cursor and connection objects, not a module-level function), which takes the SQL code, creates a prepared statement internally, then uses that statement in a loop to process each of the parameter tuples in the iterable you gave.
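A small sketch of that executemany() pattern, assuming an in-memory database and a made-up table foo loosely modelled on the fooplan example from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical schema, chosen only to mirror the question's four-column insert.
cur.execute("CREATE TABLE foo (id INTEGER, region TEXT, active INTEGER, price REAL)")

rows = [
    (1, "Hunter Valley", 1, 200.00),
    (2, "Barossa", 0, 150.50),
    (3, "Margaret River", 1, 310.25),
]

# One SQL string, many parameter tuples; statement preparation is handled internally.
cur.executemany("INSERT INTO foo VALUES (?, ?, ?, ?)", rows)
conn.commit()

count = cur.execute("SELECT COUNT(*) FROM foo").fetchone()[0]
print(count)
```

Note that the caller never sees a statement object; the only things handed over are the SQL text and the iterable of parameters.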

Many other SQL libraries in Python behave the same way. Working with prepared statement objects can be a real pain and can lead to ambiguity, and in a language like Python, which leans towards clarity and ease over raw execution speed, they aren't really the greatest option. Essentially, if you find yourself having to make hundreds of thousands or millions of calls to a complex SQL query that gets re-interpreted every time, you should probably be doing things differently. Regardless, sometimes people wish they could have direct access to these objects, because if you keep the same prepared statement around, the database server won't have to keep interpreting the same SQL code over and over. Most of the time this is approaching the problem from the wrong direction, and you will get much greater savings elsewhere or by restructuring your code.*

Perhaps more important in general is the way that prepared statements and parametrized queries keep your data sanitary and separate from your SQL code. This is vastly preferable to string formatting! You should think of parametrized queries and prepared statements, in one form or another, as the only way to pass variable data from your application into the database. If you try to build the SQL statement otherwise, it will not only run significantly slower, but you will also be vulnerable to other problems, most notably SQL injection.
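To make the difference concrete, here is a minimal sketch with sqlite3 comparing string formatting against a bound parameter. The users table and the hostile input are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")  # hypothetical table
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)

hostile = "nobody' OR '1'='1"  # attacker-controlled input

# Unsafe: the input is spliced into the SQL text and changes the query's meaning.
unsafe_sql = "SELECT name FROM users WHERE name = '%s'" % hostile
leaked = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is only ever compared as a string.
safe = conn.execute("SELECT name FROM users WHERE name = ?", (hostile,)).fetchall()

print(len(leaked), len(safe))
```

With formatting, the WHERE clause becomes true for every row; with the placeholder, the database simply looks for a user literally named `nobody' OR '1'='1` and finds none.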

*e.g., by producing the data that is to be fed into the DB in a generator function then using executemany() to insert it all at once from the generator, rather than calling execute() each time you loop.
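A sketch of what the footnote describes, again using sqlite3. The generate_rows() function and the items table are placeholders for whatever actually produces your data.

```python
import sqlite3

def generate_rows(n):
    """Yield (id, label) tuples lazily; stands in for any real data source."""
    for i in range(n):
        yield (i, f"item-{i}")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, label TEXT)")  # hypothetical table

# executemany() consumes the generator directly, so no intermediate list is built.
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany("INSERT INTO items VALUES (?, ?)", generate_rows(1000))

total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(total)
```

The SQL string is handed over once for the whole batch instead of once per execute() call in a Python loop.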

tl;dr

A parametrized query is a single operation which generates a prepared statement internally, then passes in your parameters and executes.
