How does InnoDB store character columns?

Question

This Question addresses only how 'short' CHAR and VARCHAR columns are stored in an InnoDB table.

  • Does a CHAR(10) column occupy exactly 10 bytes?
  • What happens with trailing blanks?
  • What about character sets that need more than 1 byte per character?
  • How does VARCHAR(10) differ from CHAR(10)?
  • EXPLAIN implies that all indexed varchars contain a 2-byte length field. Is it really 2 bytes? Or might it be 1 byte? (cf key_len).
  • What about different ROW_FORMATs?

Not covered in this Question (to keep it from being too broad):

  • What about TEXT.
  • What about 255, 191, off-page storage, etc.
  • What happens in an index starting with a char/varchar. (Think: removal of common prefix.)
  • What happens with char/varchar when involved in a MEMORY temp table. Also, what changes happen in version 8.0.
  • ROW_FORMAT has a significant impact on longer string columns, primarily in deciding when off-page storage is used.

Recommended answer

From the MySQL documentation:

The difference between CHAR and VARCHAR values is the way they are stored. CHAR(10) requires 10 bytes of storage (in a single-byte character set) no matter how many characters you use, because the data is right-padded with spaces. VARCHAR(10) takes only 1 byte per character actually stored (in a single-byte character set) plus a length prefix: 1 byte when values require no more than 255 bytes, 2 bytes otherwise. I don't know why key_len in EXPLAIN adds 2 bytes.
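The arithmetic in the paragraph above can be sketched in Python. This is only an illustration of the rule as the answer states it, not of InnoDB's actual record layout; the function names and the `max_bytes_per_char` parameter are hypothetical, and Python's `latin-1` codec stands in for a single-byte MySQL character set.

```python
def varchar_storage_bytes(value: str, max_chars: int,
                          encoding: str = "latin-1",
                          max_bytes_per_char: int = 1) -> int:
    """Estimate the bytes used by one VARCHAR(max_chars) value: prefix + data."""
    data_len = len(value.encode(encoding))
    # Assumption taken from the answer: a 1-byte length prefix when the column
    # can never need more than 255 bytes, otherwise a 2-byte prefix.
    prefix_len = 1 if max_chars * max_bytes_per_char <= 255 else 2
    return prefix_len + data_len


def char_storage_bytes_single_byte(value: str, n: int) -> int:
    """CHAR(n) in a single-byte character set: the value is right-padded to n bytes."""
    padded = value.ljust(n)
    return len(padded.encode("latin-1"))


print(varchar_storage_bytes("abc", 10))               # 3 data bytes + 1 prefix byte
print(char_storage_bytes_single_byte("abc", 10))      # always 10
print(varchar_storage_bytes("abc", 100, "utf-8", 4))  # 3 data bytes + 2 prefix bytes
```

Note that this models only the stored value; the 2 bytes that key_len reports in EXPLAIN are a separate matter that this sketch does not try to explain.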

I don't understand what you mean by trailing blanks, although I imagine you are referring to excess trailing spaces. With VARCHAR these are truncated with a warning, whereas in CHAR columns they are truncated silently. This makes some sense, because CHAR values are stored with trailing blanks at the end.

Regarding character sets, in this link you can see that the number of characters for a CHAR or VARCHAR column is the same, although storage will require from 1 to 4 bytes per character. Here is the list of supported character sets, and here are the bytes per character.

What I read about InnoDB:

REDUNDANT row format characteristics:

Internally, InnoDB stores fixed-length character columns such as CHAR(10) in a fixed-length format. InnoDB does not truncate trailing spaces from VARCHAR columns.

InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.
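The 768-byte rule quoted above comes down to a simple multiplication. The following Python sketch checks it for a few column definitions; the function names are hypothetical, and the per-character maxima (3 for utf8mb3, 4 for utf8mb4) are the ones mentioned in the quoted documentation.

```python
VARIABLE_LENGTH_THRESHOLD_BYTES = 768  # threshold quoted from the documentation


def max_char_column_bytes(n_chars: int, max_bytes_per_char: int) -> int:
    """Maximum byte length of a CHAR(n_chars) column."""
    return n_chars * max_bytes_per_char


def encoded_as_variable_length(n_chars: int, max_bytes_per_char: int) -> bool:
    """True if the fixed-length column reaches the 768-byte threshold."""
    return (max_char_column_bytes(n_chars, max_bytes_per_char)
            >= VARIABLE_LENGTH_THRESHOLD_BYTES)


print(encoded_as_variable_length(255, 4))  # utf8mb4: 255 * 4 = 1020 bytes
print(encoded_as_variable_length(255, 3))  # utf8mb3: 255 * 3 = 765 bytes
```

Under these assumptions, CHAR(255) crosses the threshold with utf8mb4 but stays just below it with utf8mb3.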

COMPACT row format characteristics:

Internally, for nonvariable-length character sets, InnoDB stores fixed-length character columns such as CHAR(10) in a fixed-length format.

InnoDB does not truncate trailing spaces from VARCHAR columns.

Internally, for variable-length character sets such as utf8mb3 and utf8mb4, InnoDB attempts to store CHAR(N) in N bytes by trimming trailing spaces. If the byte length of a CHAR(N) column value exceeds N bytes, InnoDB trims trailing spaces to a minimum of the column value byte length. The maximum length of a CHAR(N) column is the maximum character byte length × N.

InnoDB reserves a minimum of N bytes for CHAR(N). Reserving the minimum space N in many cases enables column updates to be done in place without causing fragmentation of the index page. By comparison, for ROW_FORMAT=REDUNDANT, CHAR(N) columns occupy the maximum character byte length × N.
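To make the CHAR(N) rules for variable-length character sets concrete, here is a rough Python model of the sizes described above. It is a sketch of the documented behavior, not of InnoDB's code; the function names are hypothetical, and Python's `utf-8` codec is used as a stand-in for utf8mb4.

```python
def compact_char_bytes(value: str, n: int, encoding: str = "utf-8") -> int:
    """Bytes a CHAR(n) value occupies under COMPACT, per the quoted rules.

    Trailing spaces are trimmed, but at least n bytes are always reserved.
    """
    trimmed = value.rstrip(" ")
    return max(n, len(trimmed.encode(encoding)))


def redundant_char_bytes(n: int, max_bytes_per_char: int) -> int:
    """Bytes a CHAR(n) value occupies under REDUNDANT: always the maximum."""
    return n * max_bytes_per_char


print(compact_char_bytes("abc", 10))     # short ASCII value: the 10-byte minimum
print(compact_char_bytes("é" * 10, 10))  # ten 2-byte characters: 20 bytes
print(redundant_char_bytes(10, 4))       # utf8mb4 under REDUNDANT: 40 bytes
```

The comparison shows why the reserved minimum matters: a mostly ASCII CHAR(10) value in utf8mb4 takes 10 bytes under COMPACT, against 40 under REDUNDANT.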

InnoDB encodes fixed-length fields greater than or equal to 768 bytes in length as variable-length fields, which can be stored off-page. For example, a CHAR(255) column can exceed 768 bytes if the maximum byte length of the character set is greater than 3, as it is with utf8mb4.

ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED handle CHAR storage in the same way as ROW_FORMAT=COMPACT.

...

DYNAMIC and COMPRESSED row formats are variations of the COMPACT row format and therefore handle CHAR storage in the same way as the COMPACT row format
