What happens during insertion, deletion, and update in SQL?

Question

I would like to know a few things regarding MySQL architecture.
1. How does MySQL process insert, delete, and update operations on an indexed table?
2. It is said that changes are made only in the change buffer when the index page is not in the buffer pool. So if changes are made after the buffer pool loads the relevant index page, the same page on disk has to be altered as well, right? Does that mean an operation has to be done in three different places?
3. How are NULL values indexed? Where are they stored in a B+tree?
4. If we update data that is part of the clustered index, when is it updated on disk?
5. What happens during bulk loading?

Recommended answer

How to process insert/update/delete...


  1. Fetch (and cache) index block(s) needed for locating the row(s) to be updated/deleted, or the blocks where new row(s) will be inserted.
  2. Fetch the data block(s). Note that all indexes include the PRIMARY KEY, which is clustered with the data.
  3. Modify the data block(s) to reflect the changes. Also deal with remembering the old data -- in case of an eventual ROLLBACK.
  4. Update unique index blocks (that includes the PK).
  5. Store non-unique index changes in the change buffer. (As you noted.)

The change buffer (CB) is designed to be 'transparent' to the actual index blocks.


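  • Index lookups will always do the right thing, whether or not an entry is still in the CB.
  • Folding CB entries back into the actual index blocks is done in the background and/or when the CB runs out of room. (I think the CB defaults to 1/4 of the buffer_pool.)
  • Enough information is stored in the transaction log that a crash will not lose pending index updates.
  • Obviously, the CB was invented for performance. An index update can be delayed while taking less space (often just a few dozen bytes) than the index block (16KB) that needs updating. Multiple changes can (often) be applied to a single index block, and that is the main saving. But note that UUIDs, MD5s, etc., cannot make good use of the CB because of their randomness. A non-unique index on the current datetime/timestamp is a case where the CB's buffering really shines.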
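The buffering idea in the bullets above can be pictured with a toy model. This is only a sketch under loud assumptions: `ToyIndex`, its page ids, and its set-based "pages" are hypothetical illustrations and do not reflect InnoDB's actual data structures or code.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model of a non-unique secondary index with a change buffer.

    Hypothetical illustration only; not how InnoDB is implemented.
    """

    def __init__(self):
        self.disk_pages = defaultdict(set)      # page_id -> entries "on disk"
        self.buffer_pool = {}                   # page_id -> cached copy of the page
        self.change_buffer = defaultdict(list)  # page_id -> pending (op, entry) pairs
        self.disk_reads = 0

    def _load(self, page_id):
        """Bring a page into the buffer pool, merging any pending changes."""
        if page_id not in self.buffer_pool:
            self.disk_reads += 1
            page = set(self.disk_pages[page_id])
            for op, entry in self.change_buffer.pop(page_id, []):
                if op == "insert":
                    page.add(entry)
                else:
                    page.discard(entry)
            self.buffer_pool[page_id] = page
        return self.buffer_pool[page_id]

    def insert(self, page_id, entry):
        """Apply in place if the page is cached; otherwise defer the change."""
        if page_id in self.buffer_pool:
            self.buffer_pool[page_id].add(entry)
        else:
            self.change_buffer[page_id].append(("insert", entry))

    def lookup(self, page_id, entry):
        """Lookups see buffered changes because loading a page merges them."""
        return entry in self._load(page_id)


idx = ToyIndex()
for value in ("a", "b", "c"):
    idx.insert(7, value)                  # page 7 is not cached, so all three are buffered

reads_before_lookup = idx.disk_reads      # no page read was needed for the inserts
pending_before_lookup = len(idx.change_buffer[7])
found = idx.lookup(7, "b")                # one read; three changes merged at once
```

In this model three index changes cost a single page read, which is the saving described above. Random keys such as UUIDs would scatter across many page ids, so few changes would share a page.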

(Sorry, my knowledge of the CB is a bit vague for the level at which you are asking. I suggest you read the code.)

NULL... I believe that is treated as a separate value that sorts before all non-null values in the B+Tree. But to confuse the issue, there is a flag determining whether nulls are treated as equal to each other. And there are restrictions on PRIMARY/UNIQUE keys.
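A small experiment can make the NULL ordering and uniqueness behaviour concrete. The sketch below uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made for convenience: SQLite is not MySQL, and the table `t` and index `t_v` are made up for the example. On these two points (NULLs first in ascending order, several NULLs allowed under a UNIQUE index) the two engines behave alike, but a real MySQL server is the place to confirm it.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v INTEGER)")
conn.execute("CREATE UNIQUE INDEX t_v ON t (v)")

# Two NULLs go into the UNIQUE index without a conflict.
conn.executemany("INSERT INTO t (v) VALUES (?)", [(5,), (None,), (1,), (None,)])

# Ascending order places the NULLs before every non-NULL value.
ordered = [row[0] for row in conn.execute("SELECT v FROM t ORDER BY v")]

# A duplicate non-NULL value is still rejected by the UNIQUE index.
try:
    conn.execute("INSERT INTO t (v) VALUES (5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```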

Related to NULL... When doing PARTITION BY RANGE on some variant/function of DATE or DATETIME, invalid dates turn into NULL, which is explicitly stored in the 'first' partition. Newbies are often puzzled as to why partition pruning does not seem to work. (Recommended partial workaround: have a 'first' partition that is otherwise empty.)

Clustered and UNIQUE indexes... All(?) write operations must check all unique indexes, hence the CB is not involved with such. Note: In InnoDB, the PRIMARY KEY is always clustered and unique and cannot(?) have NULLs.

Bulk loading... I find that a 100-row INSERT will run 10 times as fast as 100 individual INSERTs. (This is due to parsing, etc.) But at the low level, a batch insert or LOAD DATA is just a bunch of individual inserts. So, the above discussion applies.
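For illustration, here is one way a client might collapse many rows into a single multi-row INSERT. It is a sketch, not a benchmark: `build_multirow_insert` is a hypothetical helper, the table `people` is made up, and `sqlite3` with `?` placeholders stands in for a MySQL driver (MySQL drivers typically use `%s`).

```python
import sqlite3


def build_multirow_insert(table, columns, rows):
    """Return (sql, params) for one INSERT statement carrying every row.

    Hypothetical helper; table and column names are trusted input here.
    """
    row_placeholder = "(" + ", ".join("?" for _ in columns) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table,
        ", ".join(columns),
        ", ".join(row_placeholder for _ in rows),
    )
    params = [value for row in rows for value in row]  # flatten row by row
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, "name{}".format(i)) for i in range(100)]
sql, params = build_multirow_insert("people", ["id", "name"], rows)
conn.execute(sql, params)  # one statement to parse instead of 100

inserted = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

The server still performs 100 row inserts internally; only the per-statement overhead is paid once.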

Bonus answers...

"IODKU" (INSERT ... ON DUPLICATE KEY UPDATE) pretty much follows steps 1..5 above. In locating the row to update, it discovers whether to update or insert, then proceeds accordingly.

REPLACE is really shorthand for DELETE plus INSERT. But note this anomaly... If there are two unique keys on the table, a one-row REPLACE might delete 2 rows before inserting the 1 row.
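The two-unique-key anomaly is easy to reproduce. The sketch below again assumes SQLite (via Python's `sqlite3`) as a stand-in for MySQL; its REPLACE also removes every conflicting row before inserting. The `users` table and its rows are made up for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT UNIQUE, name TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "a@example.com", "Ann"), (2, "b@example.com", "Bob")],
)

# The new row collides with row 1 on id and with row 2 on email,
# so both existing rows are removed before the single insert.
conn.execute("REPLACE INTO users VALUES (?, ?, ?)", (1, "b@example.com", "Cat"))

remaining = conn.execute("SELECT id, email, name FROM users").fetchall()
```

Two rows went in, one REPLACE ran, and only one row is left, so Ann and Bob are both gone.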
