我应该使用内联 varchar(max) 列还是将其存储在单独的表中? [英] Should I use an inline varchar(max) column or store it in a separate table?

查看:54
本文介绍了我应该使用内联 varchar(max) 列还是将其存储在单独的表中?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在 MS SQL Server 2005 中创建一个表来记录某些系统操作的详细信息.从下面的表格设计中可以看出,除了 Details 之外的每一列都是不可为空的.

I want to create a table in MS SQL Server 2005 to record details of certain system operations. As you can see from the table design below, every column apart from Details is is non nullable.

CREATE TABLE [Log]
(
[LogID] [int] IDENTITY(1,1) NOT NULL,
[ActionID] [int] NOT NULL,
[SystemID] [int] NOT NULL,
[UserID] [int] NOT NULL,
[LoggedOn] [datetime] NOT NULL,
[Details] [varchar](max) NULL
)

因为 Details 列中并不总是包含数据.将此列存储在单独的表中并提供指向它的链接是否更有效?

Because the Details column won't always have data in it. Is it more efficient to store this column in a separate table and provide a link to it instead?

CREATE TABLE [Log]
(
[LogID] [int] IDENTITY(1,1) NOT NULL,
[ActionID] [int] NOT NULL,
[SystemID] [int] NOT NULL,
[UserID] [int] NOT NULL,
[LoggedOn] [datetime] NOT NULL,
[DetailID] [int] NULL
)       

CREATE TABLE [Detail]
(
[DetailID] [int] IDENTITY(1,1) NOT NULL,
[Details] [varchar](max) NOT NULL
)

对于较小的数据类型,我不会真正考虑它,但是对于 varchar(max) 这样做是否有助于保持较小的表大小?或者我只是想使数据库变得聪明而一无所获?

For a smaller data type I wouldn't really consider it, but for a varchar(max) does doing this help keep the table size smaller? Or I am just trying to out smart the database and achieving nothing?

推荐答案

保持内联.自 SQL 2005 以来,SQL Server 已将 MAX 列存储在单独的分配单元"中.请参阅 表和索引组织.这实际上与将 MAX 列保留在其自己的表中完全相同,但是没有显式这样做的任何缺点.

Keep it inline. Under the covers SQL Server already stores the MAX columns in a separate 'allocation unit' since SQL 2005. See Table and Index Organization. This in effect is exactly the same as keeping the MAX column in its own table, but w/o any disadvantage of explicitly doing so.

拥有一个显式表实际上会(因为外键约束)和消耗更多空间(因为 DetaiID 重复).更何况它需要更多的代码,而且bug是由……写代码引入的.

Having an explicit table would actually be both slower (because of the foreign key constraint) and consume more space (because of the DetaiID duplication). Not to mention that it requires more code, and bugs are introduced by... writing code.

alt文本 http://i.msdn.microsoft.com/ms189051.3be61595-d405-4b30-9794-755842d7db7e(en-us,SQL.100).gif

更新

要检查数据的实际位置,一个简单的测试就可以显示:

To check the actual location of data, a simple test can show it:

use tempdb;
go

create table a (
  id int identity(1,1) not null primary key,
  v_a varchar(8000),
  nv_a nvarchar(4000),
  m_a varchar(max),
  nm_a nvarchar(max),
  t text,
  nt ntext);
go

insert into a (v_a, nv_a, m_a, nm_a, t, nt)
values ('v_a', N'nv_a', 'm_a', N'nm_a', 't', N'nt');
go

select %%physloc%%,* from a
go

%%physloc%% 伪列将显示该行的实际物理位置,在我的例子中是第 200 页:

The %%physloc%% pseudo column will show the actual physical location of the row, in my case it was page 200:

dbcc traceon(3604)
dbcc page(2,1, 200, 3)

Slot 0 Column 2 Offset 0x19 Length 3 Length (physical) 3
v_a = v_a                            
Slot 0 Column 3 Offset 0x1c Length 8 Length (physical) 8
nv_a = nv_a                          
m_a = [BLOB Inline Data] Slot 0 Column 4 Offset 0x24 Length 3 Length (physical) 3
m_a = 0x6d5f61                       
nm_a = [BLOB Inline Data] Slot 0 Column 5 Offset 0x27 Length 8 Length (physical) 8
nm_a = 0x6e006d005f006100            
t = [Textpointer] Slot 0 Column 6 Offset 0x2f Length 16 Length (physical) 16
TextTimeStamp = 131137536            RowId = (1:182:0)                    
nt = [Textpointer] Slot 0 Column 7 Offset 0x3f Length 16 Length (physical) 16
TextTimeStamp = 131203072            RowId = (1:182:1)   

除 TEXT 和 NTEXT 之外的所有列值都内联存储,包括 MAX 类型.
更改表选项并插入新行(sp_tableoption 不影响现有行)后,MAX 类型被驱逐到自己的存储中:

All column values but the TEXT and NTEXT were stored inline, including the MAX types.
After changing the table options and insert a new row (sp_tableoption does not affect existing rows), the MAX types were evicted into their own storage:

sp_tableoption 'a' , 'large value types out of row', '1';
insert into a (v_a, nv_a, m_a, nm_a, t, nt)
values ('2v_a', N'2nv_a', '2m_a', N'2nm_a', '2t', N'2nt');    
dbcc page(2,1, 200, 3);

注意 m_a 和 nm_a 列现在如何成为 LOB 分配单元的文本指针:

Note how m_a and nm_a columns are now a Textpointer into the LOB allocation unit:

Slot 1 Column 2 Offset 0x19 Length 4 Length (physical) 4
v_a = 2v_a                           
Slot 1 Column 3 Offset 0x1d Length 10 Length (physical) 10
nv_a = 2nv_a                         
m_a = [Textpointer] Slot 1 Column 4 Offset 0x27 Length 16 Length (physical) 16
TextTimeStamp = 131268608            RowId = (1:182:2)                    
nm_a = [Textpointer] Slot 1 Column 5 Offset 0x37 Length 16 Length (physical) 16
TextTimeStamp = 131334144            RowId = (1:182:3)                    
t = [Textpointer] Slot 1 Column 6 Offset 0x47 Length 16 Length (physical) 16
TextTimeStamp = 131399680            RowId = (1:182:4)                    
nt = [Textpointer] Slot 1 Column 7 Offset 0x57 Length 16 Length (physical) 16
TextTimeStamp = 131465216            RowId = (1:182:5)                    

为了完成起见,我们还可以将非最大字段之一强制排外:

For completion sakeness we can also force the one of the non-max fields out of row:

update a set v_a = replicate('X', 8000);
dbcc page(2,1, 200, 3);

注意 v_a 列是如何存储在 Row-Overflow 存储中的:

Note how the v_a column is stored in the Row-Overflow storage:

Slot 0 Column 1 Offset 0x4 Length 4 Length (physical) 4
v_a = [BLOB Inline Root] Slot 0 Column 2 Offset 0x19 Length 24 Length (physical) 24
Level = 0                            Unused = 99                          UpdateSeq = 1
TimeStamp = 1098383360               
Link 0
Size = 8000                          RowId = (1:176:0) 

因此,正如其他人已经评论过的那样,默认情况下 MAX 类型存储在行内,如果它们合适的话.对于许多 DW 项目,这是不可接受的,因为典型的 DW 加载必须扫描或至少范围扫描,因此应该使用 sp_tableoption ..., 'large value types out of row', '1'.请注意,这不会影响现有行,在我的测试中甚至在索引重建时都不会,因此必须尽早打开该选项.

So, as other have already commented, the MAX types are stored inline by default, if they fit. For many DW projects this would be unnacceptable because the typical DW loads must scan or at least range scan, so the sp_tableoption ..., 'large value types out of row', '1' should be used. Note that this does not affect existing rows, in my test not even on index rebuild, so the option has to be turned on early.

对于大多数 OLTP 类型的加载,尽管 MAX 类型尽可能内联存储实际上是一个优势,因为 OLTP 访问模式是查找,而行宽对其影响很小.

For most OLTP type loads though the fact that MAX types are stored inline if possible is actually an advantage, since the OLTP access pattern is to seek and the row width makes little impact on it.

尽管如此,关于原始问题:不需要单独的表格.启用行外大值类型选项可在开发/测试免费成本的情况下获得相同的结果.

None the less, regarding the original question: separate table is not necessary. Turning on the large value types out of row option achieves the same result at a free cost for development/test.

这篇关于我应该使用内联 varchar(max) 列还是将其存储在单独的表中?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆