UPDATE与UPDATE WHERE [英] UPDATE vs UPDATE WHERE

查看:437
本文介绍了UPDATE与UPDATE WHERE的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含许多行的表,我想定期在其中使用cron将一列设置为0.

I have a table with many rows, where I periodically want to set one column to 0 using a cron.

在所有行上执行UPDATE(即不执行WHERE子句)还是仅执行UPDATE WHERE mycolumn != 0是什么?

What is faster / less memory consuming, doing an UPDATE on all rows (ie. no WHERE clause) or doing an UPDATE only WHERE mycolumn != 0?

推荐答案

正如原始帖子的注释中所指出的,它取决于几件事(索引,数据库引擎,存储介质的类型,可用的高速缓存等等).

As noticed in comments on the original post, it depends on several things (index, database engine, type of storage media, available cache memory, etc.).

我们可以做出有根据的猜测:

We could make an educated guess that:

a)除非列上有索引,否则我们应该始终进行全表扫描(并且我不建议仅出于查询目的而推荐一个索引,因为您将降低在线写入的数量,从而受益于脱机)过程);

a) We should always have a full-table scan unless we have an index on the column (and I'd not recommend one just for the sake of this query, as you will penalize down your online writes to benefit an offline process);

b)检查值的成本"由于不写不变记录的节省而被高额摊销(除非数据库隐式地这样做,这种情况可能会或可能不会发生) 记录将为零.

b) The "cost" for checking the values is highly amortized by the savings of not writing unchanged records (unless the database implicitly does so, which may or may not happen) if the majority of records will be zeroes.

但是假设开始堆积起来,所以我宁愿衡量.要玩一点,我已经:

but assumptions start to pile up, so I'd rather measure instead. To play a bit, I've:

  • 使用状态"数字列创建测试表

  • Created a test table with a "status" numeric column

用几百万条记录填充它(例如,使用 https://stackoverflow.com中的脚本之类的脚本/a/17268740/64635 )

Filled it with a few million records (e.g., using a script like the one in https://stackoverflow.com/a/17268740/64635)

使用不同的值对其进行设置,然后尝试将UPDATE列设置为0(带和不带WHERE).

Set it up with different values, then tried to UPDATE the column to 0, with and without the WHERE.

我的结果(可能与您的不同)是,如果确实有非零记录,那么WHERE查询会更快. .例如,在使用以下任意一种设置表后

My results (which may differ from yours) were that the WHERE query was way faster if there were indeed few non-zero records. E.g., after setting up the table with either of

UPDATE myTable SET myColumn = 1;                /* All values non-zero (1) */
UPDATE myTable SET myColumn = FLOOR(RAND()*10); /* ~90% values non-zero */

WHERE和非WHERE都更新为0的速度很慢(它们之间没有显着差异,这意味着上面的"a"是正确的),而在任何

both WHERE and non-WHERE updates to 0 were slow (and no noticeable difference between them, implying "a" above is true), whereas after any of

UPDATE myTable SET myColumn = 0;                      /* All values zero */
UPDATE myTable SET myColumn = IF(id % 500 = 0, 1, 0); /* 99.8% values zero */

带有WHEREUPDATE更快(如"b"所示).

the UPDATE with WHERE was insanely faster (as implied by "b").

我建议您在设置中尝试这些测试(例如,如果愿意,甚至进行其他测试,包括索引)(例如,创建一个单独的表并运行此类测试),并考虑您的数据集(测量/估算百分比)在您的cron作业运行时将不为零的记录数).请记住,您可能想针对成本/可用性进行优化(包括您的时间作为成本),而不是在Universe中找到性能最高的解决方案(这可能不划算),并且您一定会找到最好的解决方案.祝你好运!

I'd recommend trying these tests (and even others, including the index if you really wish) on your setup (e.g., creating a separate table and running tests like these) and consider your data set (measure/estimate the % of records that will be non-zero when your cron job runs). Keep in mind you likely want to optimize for cost/availability (including your time as a cost) instead of finding the absolutely most performant solution in the universe (which is likely not cost-effective), and you will surely find the best solution. Good luck!

这篇关于UPDATE与UPDATE WHERE的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆