Is it more efficient to separate large and small related data in a SQL table's column?


Question

I have a MySQL database table that has a column of type varchar(386). I chose this number of characters because I counted the characters of the longest entry beforehand. I have 400,000 entries currently, but it is expected to increase with time.

I ran a few tests and found that about 390,000 entries use 60 or fewer characters, whereas the remaining 10,000 entries use up to 386 characters.

Should I separate the 10,000 large entries into a separate table? How would I go about implementing that? Would this improve my query speed in the long run?

Recommended answer

VARCHAR is stored inline with the table. VARCHAR is faster when the size is reasonable; which option is faster overall depends on your data and your hardware, so you'd want to benchmark a real-world scenario with your own data.

The effective maximum number of bytes that can be stored in a VARCHAR or VARBINARY column is subject to the maximum row size of 65,535 bytes, which is shared among all columns.

For example, a VARCHAR(255) column can hold a string with a maximum length of 255 characters. Assuming that the column uses the latin1 character set (one byte per character), the actual storage required is the length of the string (L), plus one byte to record the length of the string. For the string 'abcd', L is 4 and the storage requirement is five bytes. If the same column is instead declared to use the ucs2 double-byte character set, the storage requirement is 10 bytes: The length of 'abcd' is eight bytes and the column requires two bytes to store lengths because the maximum length is greater than 255 (up to 510 bytes).
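
The arithmetic in the quoted passage can be checked with a few lines of Python. This is only a sketch of the rule as described above, assuming a fixed-width character set; the function name `varchar_storage_bytes` and its parameters are made up for illustration and are not part of MySQL.

```python
def varchar_storage_bytes(value: str, max_chars: int, bytes_per_char: int) -> int:
    """Bytes needed to store `value` in a VARCHAR(max_chars) column.

    Assumes a fixed-width character set, e.g. latin1 (1 byte per character)
    or ucs2 (2 bytes per character), as in the quoted example.
    """
    if len(value) > max_chars:
        raise ValueError("value does not fit in the column")
    data_bytes = len(value) * bytes_per_char
    # One length byte if the column can never exceed 255 bytes, otherwise two.
    max_column_bytes = max_chars * bytes_per_char
    prefix_bytes = 1 if max_column_bytes <= 255 else 2
    return data_bytes + prefix_bytes


latin1_bytes = varchar_storage_bytes("abcd", max_chars=255, bytes_per_char=1)
ucs2_bytes = varchar_storage_bytes("abcd", max_chars=255, bytes_per_char=2)
print(latin1_bytes, ucs2_bytes)  # 5 10
```

Under the same assumptions, a 60-character entry in the asker's varchar(386) column would take 62 bytes if the column uses latin1, because a 386-byte maximum needs the two-byte length prefix.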

For larger data, consider using TEXT or BLOB. TEXT and BLOB columns are implemented differently in the NDB storage engine, wherein each row in a TEXT column is made up of two separate parts. One of these is of fixed size (256 bytes), and is actually stored in the original table. The other consists of any data in excess of 256 bytes, which is stored in a hidden table. The rows in this second table are always 2,000 bytes long. This means that the size of a TEXT column is 256 if size <= 256 (where size represents the size of the row); otherwise, the size is 256 + size + (2000 - (size - 256) % 2000).
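
To make the NDB formula concrete, here is a direct transcription of it into Python. It is only a sketch: it applies to the NDB storage engine as described in the quoted text, and the function name `ndb_text_column_size` is hypothetical.

```python
def ndb_text_column_size(size: int) -> int:
    """Size of an NDB TEXT column value, following the formula quoted above.

    `size` is the size of the stored value in bytes.
    """
    if size < 0:
        raise ValueError("size must be non-negative")
    if size <= 256:
        # Everything fits in the fixed 256-byte part kept in the original table.
        return 256
    # Fixed part, the data itself, and the padding term from the quoted formula.
    return 256 + size + (2000 - (size - 256) % 2000)


small = ndb_text_column_size(60)
large = ndb_text_column_size(386)
print(small, large)  # 256 2512
```

By this formula, a 386-byte value costs far more in an NDB TEXT column than the value's own length, which is one reason to measure before switching column types.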

http://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html

It also depends on how your data is related and used. If you rarely use that field in queries (for example, because it only holds additional information), moving it into a separate table is a good option (normalization).
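
As a rough illustration of how the split asked about in the question could be implemented, the sketch below keeps short values in the main table and moves long ones to a side table. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the table names `items` and `item_long_text`, the column names, and the helper functions are all hypothetical, and the 60-character threshold is taken from the question.

```python
import sqlite3

SHORT_LIMIT = 60  # threshold from the question: most entries fit in 60 characters

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (
        id INTEGER PRIMARY KEY,
        short_text VARCHAR(60)  -- NULL when the value lives in the side table
    );
    CREATE TABLE item_long_text (
        item_id INTEGER PRIMARY KEY REFERENCES items(id),
        long_text VARCHAR(386) NOT NULL
    );
""")


def insert_entry(conn, item_id, text):
    """Store short values in the main table and long ones in the side table."""
    if len(text) <= SHORT_LIMIT:
        conn.execute("INSERT INTO items (id, short_text) VALUES (?, ?)", (item_id, text))
    else:
        conn.execute("INSERT INTO items (id, short_text) VALUES (?, NULL)", (item_id,))
        conn.execute(
            "INSERT INTO item_long_text (item_id, long_text) VALUES (?, ?)",
            (item_id, text),
        )


def fetch_entry(conn, item_id):
    """Return the text for one entry, wherever it is stored, or None if missing."""
    row = conn.execute(
        """
        SELECT COALESCE(i.short_text, l.long_text)
        FROM items AS i
        LEFT JOIN item_long_text AS l ON l.item_id = i.id
        WHERE i.id = ?
        """,
        (item_id,),
    ).fetchone()
    return row[0] if row else None


insert_entry(conn, 1, "short value")
insert_entry(conn, 2, "x" * 386)
print(fetch_entry(conn, 1), len(fetch_entry(conn, 2)))
```

Note the cost of this design: every read that needs the full text now involves a join, so whether it helps is exactly the kind of thing to benchmark against the single-column layout.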

Note: VARCHAR differs from CHAR. If you create a VARCHAR(250) column and insert just 20 characters, it takes L bytes plus a 1- or 2-byte length prefix (21 bytes with a single-byte character set such as latin1), whereas CHAR(250) is padded to its full declared length and takes 250 bytes under the same conditions.
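
The difference can be put into numbers with a small sketch. It assumes a single-byte character set (latin1) and the length-prefix rule from the documentation quoted earlier; the helper names are made up, and the table-wide figure is only an upper bound that treats every short entry as 60 characters and every long entry as 386.

```python
def varchar_bytes(chars, column_chars, bytes_per_char=1):
    """Data bytes plus a 1- or 2-byte length prefix (fixed-width charset assumed)."""
    prefix = 1 if column_chars * bytes_per_char <= 255 else 2
    return chars * bytes_per_char + prefix


def char_bytes(column_chars, bytes_per_char=1):
    """CHAR is padded to the declared length regardless of the value stored."""
    return column_chars * bytes_per_char


# The note's example: 20 characters in a 250-character column.
varchar_20_in_250 = varchar_bytes(20, 250)
char_20_in_250 = char_bytes(250)
print(varchar_20_in_250, char_20_in_250)  # 21 250

# Upper bound for the asker's varchar(386) column across all 400,000 rows.
varchar_total = 390_000 * varchar_bytes(60, 386) + 10_000 * varchar_bytes(386, 386)
print(varchar_total)  # 28060000
```

Because VARCHAR stores only what each value needs, the 390,000 short rows are not padded out to 386 characters just because a few long rows share the column.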
