MySQL performance of query making addition of columns in where clause

Question

I have a query that adds several column values together in the WHERE clause. I can't precompute this sum in a single column because the combination of columns to use varies between queries. My problem is that my table is very large (several hundred million rows) and performance is very bad.

Example table:

+---------+------------+--------+--------+--------+--------+
| tableId | categoryId | value1 | value2 | value3 | value4 |
+---------+------------+--------+--------+--------+--------+
|       1 |          1 |      1 |      0 |      5 |      7 |
|       2 |          1 |      8 |      1 |      7 |      0 |
|       3 |          1 |     10 |      5 |      0 |     20 |
|       4 |          2 |      0 |     15 |      0 |     22 |
|       5 |          2 |     20 |      0 |     11 |      0 |
+---------+------------+--------+--------+--------+--------+

Example queries:

SELECT * FROM myTable WHERE categoryId = 1 AND (value1 + value2 + value3 + value4) > 9;
SELECT * FROM myTable WHERE categoryId = 1 AND (value1 + value3 + value4) > 5;

What would be the best strategy for improving the performance of such queries? (Edit: I already have an index on categoryId; it does not help.)

Would an index help for such queries? Would I then have to create an index for every possible combination of columns? Wouldn't the resulting indexes be very large?

ALTER TABLE myTable
ADD INDEX(categoryId, value1),
ADD INDEX(categoryId, value2),
ADD INDEX(categoryId, value3),
ADD INDEX(categoryId, value4),
ADD INDEX(categoryId, value1, value2),
ADD INDEX(categoryId, value1, value3),
ADD INDEX(categoryId, value1, value4);
-- etc., one index for each remaining combination of columns
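
To get a feel for how many indexes this approach implies, here is a minimal sketch that enumerates every non-empty subset of the value columns and prints one `ADD INDEX` clause per subset. The helper name `index_clauses` is hypothetical; the table and column names are taken from the example above. It only counts and formats the clauses and makes no claim about whether MySQL could use such indexes for a predicate on a sum.

```python
from itertools import combinations

# Value columns from the example table.
VALUE_COLUMNS = ["value1", "value2", "value3", "value4"]

def index_clauses(columns):
    """Return one ADD INDEX clause per non-empty subset of `columns`.

    Hypothetical helper, for illustration only.
    """
    clauses = []
    for size in range(1, len(columns) + 1):
        for subset in combinations(columns, size):
            clauses.append("ADD INDEX(categoryId, " + ", ".join(subset) + ")")
    return clauses

clauses = index_clauses(VALUE_COLUMNS)
print("ALTER TABLE myTable\n" + ",\n".join(clauses) + ";")
print(len(clauses), "indexes")  # 2**4 - 1 non-empty subsets
```

The number of subsets grows as 2^n - 1, so four value columns need 15 indexes and a fifth column would raise that to 31.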

Or maybe create a link table, with boolean fields specifying which columns were used? But that would result in a table with several billion rows, and I'm not sure that is better...

+---------+-----------+-----------+-----------+-----------+----------+
| tableId | useValue1 | useValue2 | useValue3 | useValue4 | valueSum |
+---------+-----------+-----------+-----------+-----------+----------+
|       1 |         1 |         1 |         1 |         1 |       13 |
|       1 |         1 |         1 |         1 |         0 |        6 |
|       1 |         1 |         1 |         0 |         0 |        1 |
|       1 |         1 |         1 |         0 |         1 |        8 |
|       1 |         1 |         0 |         1 |         1 |       13 |
|       1 |         1 |         0 |         1 |         0 |        6 |
|       1 |         1 |         0 |         0 |         0 |        1 |
|       1 |         1 |         0 |         0 |         1 |        8 |
|       1 |         0 |         1 |         1 |         1 |       12 |
|       1 |         0 |         1 |         1 |         0 |        5 |
|       1 |         0 |         1 |         0 |         0 |        0 |
|       1 |         0 |         1 |         0 |         1 |        7 |
|       1 |         0 |         0 |         1 |         1 |       12 |
|       1 |         0 |         0 |         1 |         0 |        5 |
|       1 |         0 |         0 |         0 |         1 |        7 |
+---------+-----------+-----------+-----------+-----------+----------+
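
The rows above can be reproduced with a short sketch that sums every non-empty combination of the four values of one source row. The function name `link_rows` is hypothetical; the input values are those of `tableId = 1` in the example table. The rows come out in a slightly different order than in the table, but the set of rows is the same.

```python
from itertools import product

def link_rows(table_id, values):
    """Return (tableId, use flags..., valueSum) for each non-empty column subset.

    Hypothetical helper, for illustration only.
    """
    rows = []
    for flags in product((1, 0), repeat=len(values)):
        if not any(flags):
            continue  # skip the all-zero combination, which selects no column
        value_sum = sum(v for v, used in zip(values, flags) if used)
        rows.append((table_id, *flags, value_sum))
    return rows

rows = link_rows(1, (1, 0, 5, 7))  # value1..value4 of tableId 1
for row in rows:
    print(row)
print(len(rows), "link rows per source row")
```

At 15 link rows per source row, a source table of several hundred million rows expands into several billion link rows, which is the size concern raised above.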

With an index:

ALTER TABLE linkTable ADD INDEX(tableId, useValue1, useValue2, useValue3, useValue4, valueSum);

Any other ideas?

Recommended answer

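Following the answers to a follow-up question about the overall database design, the conclusions are:

  • All my data types and indexes are correct.

  • My design with enumerated columns is not very elegant, but it suits a row-based database like MySQL and gives the best performance on such an engine.

  • To really solve this performance issue, I should move to a column-based database with a better design, as stated in the comments on my other question (where the data to aggregate would be in the same column but spread over several rows).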
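
The "same column, several rows" layout from the last point can be sketched as follows. This uses Python's built-in `sqlite3` module purely so the example is self-contained and runnable; SQLite is not a column-based engine, and the sketch says nothing about performance. The table name `myValues`, the column `valueType`, and the helper `ids_with_sum_above` are hypothetical names chosen for illustration; the data are the three `categoryId = 1` rows of the example table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tall table: one row per (tableId, valueType) instead of
# one column per value.
conn.execute(
    "CREATE TABLE myValues ("
    " tableId INTEGER, categoryId INTEGER, valueType INTEGER, value INTEGER)"
)

# Rows 1-3 of the example table, unpivoted into the tall layout.
wide_rows = {1: (1, 0, 5, 7), 2: (8, 1, 7, 0), 3: (10, 5, 0, 20)}
for table_id, values in wide_rows.items():
    for value_type, value in enumerate(values, start=1):
        conn.execute(
            "INSERT INTO myValues VALUES (?, 1, ?, ?)",
            (table_id, value_type, value),
        )

def ids_with_sum_above(value_types, threshold):
    """Return tableIds in category 1 whose selected values sum above threshold."""
    placeholders = ", ".join("?" for _ in value_types)
    query = (
        "SELECT tableId FROM myValues"
        " WHERE categoryId = 1 AND valueType IN (" + placeholders + ")"
        " GROUP BY tableId HAVING SUM(value) > ? ORDER BY tableId"
    )
    return [row[0] for row in conn.execute(query, (*value_types, threshold))]

# Equivalent of: value1 + value2 + value3 + value4 > 9
print(ids_with_sum_above((1, 2, 3, 4), 9))
# Equivalent of: value1 + value3 + value4 > 5
print(ids_with_sum_above((1, 3, 4), 5))
```

With this layout, the choice of columns becomes a filter on `valueType` and the addition becomes a `SUM` aggregate, so every combination is served by the same query shape.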
