MySQL BIGINT(20) vs Varchar(31) performance


Question

I have read that a BIGINT like 23423423423423423637 is better as a primary unique key than a VARCHAR like 961637593864109_412954765521130, but how big is the difference with, say, 1 million rows, when I will never sort and will only ever select or update a single row? It would be much more comfortable for me to use VARCHAR, and I will stay with it if the performance difference is under 30% or so. I can't find any benchmark for that.

Solution

This would really have to be measured. We can make some "guesses" based on what we know and what we assume, but those are just guesses.

You don't mention whether this table is InnoDB, or MyISAM with dynamic rows, or MyISAM with fixed length rows. That's going to make some difference.

But for values like the one you posted, '961637593864109_412954765521130' (31 characters), assuming you're using a single-byte character set (e.g. latin1), or a character set that encodes those particular characters into a single byte each (e.g. utf8)...
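The single-byte assumption is easy to check outside the database. The sketch below uses Python's `latin-1` and `utf-8` codecs as stand-ins for MySQL's latin1 and utf8 character sets (an assumption: the two are not identical in general, but they agree for digits and the underscore). The helper name `encoded_length` is made up for this illustration.

```python
# The sample key from the question.
KEY = "961637593864109_412954765521130"


def encoded_length(value: str, codec: str) -> int:
    """Return the number of bytes `value` occupies when encoded with `codec`."""
    return len(value.encode(codec))


if __name__ == "__main__":
    print("characters:", len(KEY))
    for codec in ("latin-1", "utf-8"):
        # Digits and "_" are plain ASCII, so each character takes one byte.
        print(codec, "bytes:", encoded_length(KEY, codec))
```

If the keys could ever contain non-ASCII characters, the utf-8 byte count would exceed the character count and the arithmetic that follows would change accordingly.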

For InnoDB and MyISAM dynamic format, that's 31+1-8=24 extra bytes for that row. (BIGINT fits in 8 bytes, a VARCHAR(31) value of 31 characters will use 32 bytes.)

For MyISAM table with fixed length rows, that would be a difference of 23 bytes per row. (Space is reserved for all 31 characters, and the length doesn't have to be stored.)
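The per-row arithmetic for the two cases can be written out as a small sketch. It only restates the numbers above (8 bytes for BIGINT, one length byte for a short VARCHAR in a dynamic row, no length byte in a fixed-length row); the function names are made up for this illustration.

```python
BIGINT_BYTES = 8  # a BIGINT always occupies 8 bytes


def extra_bytes_dynamic(value_bytes: int) -> int:
    """Extra bytes per row for InnoDB or MyISAM dynamic rows.

    The VARCHAR stores the value itself plus a 1-byte length prefix.
    """
    return value_bytes + 1 - BIGINT_BYTES


def extra_bytes_fixed(declared_bytes: int) -> int:
    """Extra bytes per row for MyISAM fixed-length rows.

    The full declared width is reserved and no length prefix is counted.
    """
    return declared_bytes - BIGINT_BYTES


if __name__ == "__main__":
    print("dynamic rows:", extra_bytes_dynamic(31), "extra bytes")
    print("fixed rows:  ", extra_bytes_fixed(31), "extra bytes")
```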

That primary key value will also be repeated in every secondary index (with InnoDB, each secondary index entry carries the primary key value), so there's also increased space with each index.

Assuming that your table rows are 120 bytes using BIGINT, and the rows are 144 bytes with VARCHAR, that's a 20% increase. The larger your rows, the smaller the percentage increase, and vice versa.

For 1,000,000 rows (I so want to say "one meelyun rows" in the same way that Dr. Evil puts his pinky finger to the corner of his mouth and says "one million dollars") that extra 24 bytes per row totals around 24MB.
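Both figures follow from simple arithmetic, sketched below. The 120-byte and 144-byte row sizes are the assumed values from the example, not measurements, and the function names are made up for this illustration.

```python
def percent_increase(old_bytes: int, new_bytes: int) -> float:
    """Relative growth of the row size, in percent."""
    return (new_bytes - old_bytes) * 100 / old_bytes


def total_extra_megabytes(rows: int, extra_bytes_per_row: int) -> float:
    """Total extra storage in MB (1 MB taken as 1,000,000 bytes)."""
    return rows * extra_bytes_per_row / 1_000_000


if __name__ == "__main__":
    print("row growth:", percent_increase(120, 144), "%")
    print("extra space:", total_extra_megabytes(1_000_000, 24), "MB")
    # A wider row makes the same 24 bytes matter less in relative terms.
    print("row growth at 480 bytes:", percent_increase(480, 504), "%")
```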

But it's not really that easy. In terms of InnoDB space, it's a matter of how many rows "fit" into a block. The larger the average row size, the larger the amount of free space will be in a block.

If you don't do anything with the rows except store them on disk, then it's really just an increase in disk space, and extra time and space for backups.


If the same number of "144 byte" rows fit in a block as "120 byte" rows, then you aren't going to see any difference in space. But if fewer rows fit in a block, that's more blocks, more space in the InnoDB buffer pool, more I/O, etc.
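To get a rough feel for the "rows per block" effect, here is a deliberately simplified sketch. It assumes InnoDB's default 16KB page size and ignores page headers, per-row overhead, and fill factor, so the absolute numbers are only illustrative; `usable_fraction` is a made-up knob for approximating that overhead.

```python
import math

PAGE_BYTES = 16 * 1024  # InnoDB default page size (configurable on the server)


def rows_per_page(row_bytes: int, usable_fraction: float = 1.0) -> int:
    """Whole rows that fit in one page, ignoring all page and row overhead."""
    return int(PAGE_BYTES * usable_fraction) // row_bytes


def pages_needed(rows: int, row_bytes: int, usable_fraction: float = 1.0) -> int:
    """Pages required to hold `rows` rows of `row_bytes` bytes each."""
    return math.ceil(rows / rows_per_page(row_bytes, usable_fraction))


if __name__ == "__main__":
    for row_bytes in (120, 144):
        print(
            f"{row_bytes}-byte rows: {rows_per_page(row_bytes)} per page,",
            f"{pages_needed(1_000_000, row_bytes)} pages for 1,000,000 rows",
        )
```

Under these assumptions fewer of the wider rows fit in each page, so more pages are needed, which is the "more blocks, more buffer pool, more I/O" case described above.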


For queries of a single row, either by primary key value, or by some other unique index lookup, the difference is going to be negligible.

If you are dealing with larger result sets, then that's extra memory for preparing the result set, and extra bytes to transfer to the client, etc.


If the VARCHAR key is designed in such a way that "groups" of rows that are accessed together have the same leading portion of the key value, then with InnoDB, there may actually be some performance improvement. That's because the primary key is the cluster key, so there's a much better chance that the rows needed to satisfy a query are in the same block, rather than being spread out over a bunch of blocks.
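The clustering effect can be illustrated with a toy model: sort the keys the way a clustered index would order them, cut the sorted list into equally sized "blocks", and count how many blocks one group of rows touches. The key layouts (`group_item` versus `item_group`) and all names below are hypothetical, and real B-tree pages are not packed this evenly.

```python
def blocks_touched(all_keys, wanted_keys, rows_per_block):
    """Count distinct blocks holding `wanted_keys` when rows are stored in key order."""
    position = {key: i for i, key in enumerate(sorted(all_keys))}
    return len({position[key] // rows_per_block for key in wanted_keys})


GROUPS, ITEMS, ROWS_PER_BLOCK = 100, 100, 100

# Layout A: the group id leads the key, so a group's rows sort next to each other.
group_first = {(g, i): f"{g:03d}_{i:03d}" for g in range(GROUPS) for i in range(ITEMS)}
# Layout B: the group id trails the key, so a group's rows are scattered.
item_first = {(g, i): f"{i:03d}_{g:03d}" for g in range(GROUPS) for i in range(ITEMS)}


def blocks_for_group(layout, group):
    """Blocks read to fetch every row of one group under the given key layout."""
    wanted = [key for (g, _), key in layout.items() if g == group]
    return blocks_touched(layout.values(), wanted, ROWS_PER_BLOCK)


if __name__ == "__main__":
    print("group-first keys:", blocks_for_group(group_first, 42), "block(s)")
    print("item-first keys: ", blocks_for_group(item_first, 42), "block(s)")
```

In this model the group-first layout keeps a whole group in one block, while the item-first layout spreads the same rows over one block per row.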

The converse is that if inserts and deletes are performed, there will be more free space in some blocks. (With the deletes, the space for deleted rows remains in the block; to get that reused, you'd need to insert a row that had the same key value, or at least a key value close enough that it lands in the same block.) And with random inserts, we're going to get block splits.
