Best practices for SQL varchar column length


Question

Every time I set up a new SQL table or add a new varchar column to an existing table, I wonder one thing: what is the best value for the length?

So, let's say you have a column called name of type varchar, and you have to choose a length. I cannot think of a name longer than 20 characters, but you never know. Instead of using 20, though, I always round up to the next 2^n number; in this case I would choose 32 as the length. I do that because, from a computer scientist's point of view, a power of 2 looks more "even" to me than other numbers, and I am simply assuming that the architecture underneath can handle those numbers slightly better than others.

On the other hand, MSSQL Server, for example, sets the default length to 50 when you create a varchar column. That makes me wonder: why 50? Is it just a random number, is it based on an average column length, or what?

It could also be, or probably is, the case that different SQL server implementations (like MySQL, MSSQL, Postgres, ...) have different best column length values.

Answer

No DBMS I know of has any "optimization" that will make a VARCHAR with a 2^n length perform better than one with a max length that is not a power of 2.

I think early SQL Server versions actually treated a VARCHAR with length 255 differently than one with a higher maximum length. I don't know if this is still the case.

For almost all DBMS, the actual storage that is required is only determined by the number of characters you put into it, not the max length you define. So from a storage point of view (and most probably a performance one as well), it does not make any difference whether you declare a column as VARCHAR(100) or VARCHAR(500).
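A quick way to see this for yourself (assuming PostgreSQL; the table names here are made up for illustration) is `pg_column_size`, which reports the stored size of a value:

```sql
-- Two tables that differ only in the declared maximum length.
CREATE TABLE t_small (name varchar(100));
CREATE TABLE t_large (name varchar(500));

INSERT INTO t_small VALUES ('Alice');
INSERT INTO t_large VALUES ('Alice');

-- Both queries report the same stored size: it depends on the value
-- actually stored, not on the declared maximum.
SELECT pg_column_size(name) FROM t_small;
SELECT pg_column_size(name) FROM t_large;
```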

You should see the max length provided for a VARCHAR column as a kind of constraint (or business rule) rather than a technical/physical thing.

For PostgreSQL the best setup is to use text without a length restriction and a CHECK CONSTRAINT that limits the number of characters to whatever your business requires.
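A minimal sketch of that setup (assuming PostgreSQL; the table and constraint names are hypothetical):

```sql
-- The column itself is unbounded text; the business rule lives in a
-- named CHECK constraint instead of the type declaration.
CREATE TABLE customers (
    id   bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name text NOT NULL,
    CONSTRAINT customers_name_len CHECK (char_length(name) <= 20)
);
```

Because the constraint is named, it can later be dropped and re-created with a different limit without touching the column definition.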

If that requirement changes, altering the check constraint is much faster than altering the table (because the table does not need to be rewritten).
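Under the same assumptions as above (hypothetical names), widening the limit is then a pair of quick statements:

```sql
-- Re-creating the CHECK constraint validates existing rows but does not
-- rewrite the table data. Adding it with NOT VALID would skip even the
-- validation scan, deferring it to a later VALIDATE CONSTRAINT.
ALTER TABLE customers DROP CONSTRAINT customers_name_len;
ALTER TABLE customers ADD CONSTRAINT customers_name_len
    CHECK (char_length(name) <= 100);
```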

The same can be applied for Oracle and others; in Oracle it would be VARCHAR(4000) instead of text, though.

I don't know if there is a physical storage difference between VARCHAR(max) and e.g. VARCHAR(500) in SQL Server. But apparently there is a performance impact when using varchar(max) as compared to varchar(8000).

See this link (posted by Erwin Brandstetter as a comment).

Edit 2013-09-22

Regarding bigown's comment:

In Postgres versions before 9.2 (which was not available when I wrote the initial answer) a change to the column definition did rewrite the whole table, see e.g. here. Since 9.2 this is no longer the case, and a quick test confirmed that increasing the column size for a table with 1.2 million rows indeed only took 0.5 seconds.

For Oracle this seems to be true as well, judging by the time it takes to alter a big table's varchar column. But I could not find any reference for that.

For MySQL the manual says "In most cases, ALTER TABLE makes a temporary copy of the original table". And my own tests confirm that: running an ALTER TABLE on a table with 1.2 million rows (the same as in my test with Postgres) to increase the size of a column took 1.5 minutes. In MySQL, however, you cannot use the "workaround" of a check constraint to limit the number of characters in a column.

For SQL Server I could not find a clear statement on this, but the execution time to increase the size of a varchar column (again the 1.2 million rows table from above) indicates that no rewrite takes place.

Edit 2017-01-24

Seems I was (at least partially) wrong about SQL Server. See this answer from Aaron Bertrand, which shows that the declared length of an nvarchar or varchar column makes a huge difference to performance.

