PyMySQL different updates in one query?

Question

I have a Python script that goes through roughly 350,000 data objects and, depending on some tests, needs to update the row that represents each of those objects in a MySQL db. I'm using pymysql because I've had the least trouble with it, especially when sending large select queries (select statements with a where column IN (....) clause that can contain 100,000+ values).

Since the update for each row can be different, each update statement is different. For example, for one row we might want to update first_name, but for another row we want to leave first_name untouched and update last_name instead.

This is why I don't want to use the cursor.executemany() method, which takes one generic update statement that you then feed values to. As I mentioned, each update is different, so one generic update statement doesn't really work for my case. I also don't want to send 350,000 update statements individually over the wire. Is there any way I can package all of my update statements together and send them at once?

I tried putting them all in one query and using the cursor.execute() method, but it doesn't seem to update all the rows.

Recommended answer

SQL #1: CREATE TABLE t with whatever columns you might need to change. Make all of them nullable (NULL, as opposed to NOT NULL).

SQL #2: Do a bulk INSERT (or LOAD DATA) of all the changes needed. E.g., if changing only first_name, fill in id and first_name, but leave the other columns NULL.

SQL #3-14:

UPDATE real_table
  JOIN t  ON t.id = real_table.id
  SET real_table.first_name = t.first_name
  WHERE t.first_name IS NOT NULL;
# ditto for each other column.
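
The three steps above could be driven from Python roughly as follows. This is only a sketch under stated assumptions: the helper names, the `COLUMNS` tuple, the `INT` / `VARCHAR(255)` column types, and the use of a TEMPORARY table are illustrative choices, not part of the answer, and `conn` is assumed to be an already-open pymysql connection. Column names are interpolated into the SQL, so they must come from trusted code, never from user input.

```python
# Hypothetical column list; only first_name and last_name appear in the question.
COLUMNS = ("first_name", "last_name")


def to_staging_rows(changes, columns=COLUMNS):
    """Turn {id: {column: new_value}} into tuples for the staging table.

    Columns that should stay untouched become None, which is sent as SQL NULL.
    """
    return [
        (row_id, *(fields.get(col) for col in columns))
        for row_id, fields in changes.items()
    ]


def build_update_statements(columns=COLUMNS, real_table="real_table", staging="t"):
    """Build one UPDATE ... JOIN statement per column (SQL #3 onward)."""
    return [
        f"UPDATE {real_table} JOIN {staging} ON {staging}.id = {real_table}.id "
        f"SET {real_table}.{col} = {staging}.{col} "
        f"WHERE {staging}.{col} IS NOT NULL"
        for col in columns
    ]


def apply_changes(conn, changes, columns=COLUMNS):
    """Run SQL #1, #2 and the per-column updates on an open pymysql connection."""
    # Assumed column types; adjust them to match the real table.
    col_defs = ", ".join(f"{col} VARCHAR(255) NULL" for col in columns)
    placeholders = ", ".join(["%s"] * (len(columns) + 1))
    with conn.cursor() as cur:
        # SQL #1: staging table whose data columns are all nullable.
        cur.execute(f"CREATE TEMPORARY TABLE t (id INT PRIMARY KEY, {col_defs})")
        # SQL #2: bulk insert of every pending change.
        cur.executemany(
            f"INSERT INTO t (id, {', '.join(columns)}) VALUES ({placeholders})",
            to_staging_rows(changes, columns),
        )
        # SQL #3 onward: one UPDATE per column.
        for stmt in build_update_statements(columns):
            cur.execute(stmt)
    conn.commit()
```

Here `changes` would look like `{1: {"first_name": "Ann"}, 2: {"last_name": "Lee"}}`. Because NULL in the staging table means "leave unchanged", this scheme cannot be used to set a column to NULL.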

All SQLs except #1 will be time-consuming. And since UPDATE needs to build an undo log, it could time out or otherwise be problematic. See a discussion of chunking if necessary.
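
The answer does not say how to chunk. One possible form, assuming integer ids, is to run each UPDATE over a bounded id range and commit after every range so that no single transaction grows too large. The function names and the default chunk size below are hypothetical, and `conn` is again assumed to be an open pymysql connection.

```python
def id_chunks(min_id, max_id, size):
    """Yield inclusive (low, high) id ranges covering min_id..max_id.

    size is assumed to be a positive integer.
    """
    low = min_id
    while low <= max_id:
        high = min(low + size - 1, max_id)
        yield low, high
        low = high + 1


def update_in_chunks(conn, column, min_id, max_id, size=10_000):
    """Apply the per-column UPDATE one id range at a time."""
    # `column` must be a trusted identifier; only the id bounds are parameters.
    sql = (
        "UPDATE real_table JOIN t ON t.id = real_table.id "
        f"SET real_table.{column} = t.{column} "
        f"WHERE t.{column} IS NOT NULL AND t.id BETWEEN %s AND %s"
    )
    with conn.cursor() as cur:
        for low, high in id_chunks(min_id, max_id, size):
            cur.execute(sql, (low, high))
            conn.commit()  # keep each transaction, and its undo log, small
```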

If necessary, use functions such as COALESCE(), GREATEST(), IFNULL(), etc.
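
As one illustration of how COALESCE() might be used here (this is an interpretation, not something the answer spells out), the per-column statements could be folded into a single UPDATE that keeps the existing value wherever the staging column is NULL. The helper name is hypothetical, and the column names must be trusted identifiers.

```python
def build_coalesce_update(columns, real_table="real_table", staging="t"):
    """Build one UPDATE that covers every column in a single pass.

    COALESCE(t.col, real_table.col) evaluates to the staged value when it is
    not NULL and otherwise to the value the row already has.
    """
    assignments = ", ".join(
        f"{real_table}.{col} = COALESCE({staging}.{col}, {real_table}.{col})"
        for col in columns
    )
    return (
        f"UPDATE {real_table} JOIN {staging} ON {staging}.id = {real_table}.id "
        f"SET {assignments}"
    )
```

The trade-off is a single statement that touches every staged row, instead of one statement per column that touches only the rows where that column changes.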

Mass UPDATEs usually imply poor schema design.

(If Ryan jumps in with an 'Answer' instead of just a 'Comment', he should probably get the 'bounty'.)
