MySQL 索引 char(1) 列 [英] MySQL indexing char(1) columns

查看:86
本文介绍了MySQL 索引 char(1) 列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含要优化的复杂查询的表,我阅读了大部分关于 MySQL 索引的文档......但在这种情况下我不确定该怎么做:

数据结构:

-- 请不要评论字段类型和名称,它是外包项目.创建表格项目(record_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,solid CHAR(1) NOT NULL, -- 只有 'Y','N' 值可选 CHAR(1) NULL, -- 只有 'Y','N', NULL 值数据文本);

查询:

SELECT * FROM 项目WHERE record_id != 88AND 实体 = 'Y'AND 可选 !='N' -- 'Y' OR NULL

当然还有额外的连接和相关数据,但这是最大的过滤器.

在以下场景中:
- 200 000+ 条记录,
- 10%(全部)solid = 'Y',
- 10%(全部)带有 optional !='N',

这个查询的好索引是什么?
或更准确地说:

  • 第一个检查记录 != 88 是否会以任何方式减慢他们的查询速度?
    (它只排除一个结果......?)

  • 哪个更快(optional !='N')或('optional' = 'Y' OR 'optional' iS NULL)
    如上所述,optional = 'N' 是总数的 10%.

  • 索引一个只有 2 个可能值的 CHAR(1) 列有什么特别的吗?

  • 我可以使用这个索引吗(record_id、solid、可选)?

  • 我可以为特定值创建索引吗(solid = 'Y',可选 !='N')?


按照@Jack 的要求,当前 EXPLAIN 结果(总共 30 000 行,有 20 个结果):

<前>+--------------+-------+--------------+---------+---------+------+-------+--------------+|选择类型 |类型 |可能的关键|关键|密钥长度 |参考 |行 |额外 |+--------------+-------+--------------+---------+---------+------+-------+--------------+|主要 |范围|主要 |主要 |4 |空 |16228 |使用 where |+--------------+-------+--------------+---------+---------+------+-------+--------------+

解决方案

这是一个有趣的问题.总体而言,您的查询的估计选择性约为 1%.因此,如果 100 条记录适合一个页面,那么您会假设每个页面仍然需要读取,即使有索引.因为记录非常小(取决于 data),所以这很有可能.从这个角度来看,索引是不值得的.

在以下情况下索引是值得的.第一种是当索引是覆盖索引时,这意味着您可以使用索引中的所有列来满足查询.例如:

选择计数(*)来自项目WHERE record_id != 88 AND solid = 'Y' AND optional !='N' -- 'Y' OR NULL

索引位于solid,可选,record_id 的位置.查询不需要返回到原始数据页.

另一种情况是索引是主(或聚集)索引.数据按该顺序存储,因此获取有限数量的结果将减少查询的读取开销.这样做的缺点是更新和插入的成本更高,因为数据实际上必须移动.

在您的情况下,我最好的猜测是索引没有用,除非 data 非常大(在千字节范围内).

I have a table with a complex query that I look for optimization, I read most of the documentation on MySQL indexing .. but in this case I`m not sure what to do:

Data structure:

-- please, don't comment on the field types and names, it is outsourced project.

CREATE TABLE items(
  record_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  solid CHAR(1) NOT NULL, -- only 'Y','N' values
  optional CHAR(1) NULL, -- only 'Y','N', NULL values
  data TEXT
 );

Query:

SELECT * FROM items
WHERE record_id != 88
AND solid = 'Y'
AND optional !='N'  -- 'Y' OR NULL

Of course there are extra joins and related data, but this are the biggest filters.

In the scenario of:
- 200 000+ records,
- 10% (from all) with solid = 'Y',
- 10% (from all) with optional !='N',

What would be good index for this query ?
or more precisely:

  • does the first check record != 88 slows they query in any way ?
    (it only eleminates one result...?)

  • which is faster (optional !='N') or ( 'optional' = 'Y' OR 'optional' iS NULL )
    as mentioned above optional = 'N' are 10% of the total count.

  • is there anything special for indexing a CHAR(1) column with only 2 possible values?

  • can I use this index (record_id, solid, optional)?

  • can I create a index for specific value (solid = 'Y', optional !='N')?


As @Jack requested, current EXPLAIN result (out of 30 000 total rows with 20 results):

+-------------+-------+--------------+---------+---------+------+-------+-------------+
| select_type | type  | possible_key |   key   | key_len | ref  | rows  |    Extra    |
+-------------+-------+--------------+---------+---------+------+-------+-------------+
| PRIMARY     | range | PRIMARY      | PRIMARY |       4 | NULL | 16228 | Using where |
+-------------+-------+--------------+---------+---------+------+-------+-------------+

解决方案

This is an interesting question. Overall, your query has an estimated selectivity of about 1%. So, if 100 records fit on a page, then you would assume that each page would still have to be read, even with the index. Because a record is so small (depending on data that is), this is quite likely. From that perspective, an index is not worth it.

An index would be worth it under the following circumstances. The first is when the index is a covering index, meaning that you can satisfy the query with all the columns in the index. For example:

select count(*)
FROM items
WHERE record_id != 88 AND solid = 'Y' AND optional !='N'  -- 'Y' OR NULL

Where the index is on solid, optional, record_id. The query doesn't need to go back to the original data pages.

Another case would be when the index is a primary (or clustered) index. The data is stored in that order, so fetching a limited number of results would reduce the read overhead of the query. The downside to this is that updates and inserts are more expensive, because data actually has to move.

My best guess in your case is that an index would not be useful, unless data is quite large (in the kilobyte range).

这篇关于MySQL 索引 char(1) 列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆