Overhead of varchar(max) columns with small data

Question

As part of a bulk load of data from an external source, the staging table is defined with varchar(max) columns. The idea is that each column can hold whatever it finds in the source CSV file, and that we'll validate the data (for type, size, precision, etc.) later.

But I'm concerned that a varchar(max) column carries a lot of overhead when the values are shorter than 200 characters. The fellow who designed this assures me it is best practice for ETL, but I thought I would validate that assertion with the community.

Recommended answer

VARCHAR(MAX) column values are stored IN the table row, space permitting. So if you have a single VARCHAR(MAX) field holding 200 or 300 bytes, chances are it'll be stored inline with the rest of your data. No problem or additional overhead here.

Only when the entire data of a single row can no longer fit on a single SQL Server page (8 KB, of which a row may use at most 8,060 bytes) will SQL Server move VARCHAR(MAX) data into overflow pages.
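To get a rough feel for when a staging row would stay inline, here is a minimal back-of-the-envelope sketch in Python. It is not how SQL Server itself decides. It assumes the commonly documented 8,060-byte in-row limit and an approximate record layout (4-byte header, 2-byte column count, NULL bitmap, 2-byte variable-column count, 2-byte offset per variable-length column), and it ignores things like row versioning tags and compression. The function names are made up for illustration.

```python
import math

# Assumption: documented SQL Server limit for the in-row portion of a record.
MAX_IN_ROW_BYTES = 8060


def estimate_row_bytes(value_lengths):
    """Approximate the in-row size of a record whose columns are all
    variable-length (e.g. a staging table of varchar(max) columns).

    value_lengths: byte length of the value stored in each column.
    """
    n = len(value_lengths)
    header = 4                        # status bits + offset to end of fixed-length data
    column_count = 2                  # number of columns in the record
    null_bitmap = math.ceil(n / 8)    # one bit per column
    var_count = 2 if n else 0         # number of variable-length columns
    var_offsets = 2 * n               # one 2-byte end offset per variable-length column
    return (header + column_count + null_bitmap
            + var_count + var_offsets + sum(value_lengths))


def fits_in_row(value_lengths):
    """True if the estimated record size stays within the in-row limit."""
    return estimate_row_bytes(value_lengths) <= MAX_IN_ROW_BYTES


if __name__ == "__main__":
    # Ten columns of 200 bytes each: comfortably inline under these assumptions.
    print(estimate_row_bytes([200] * 10), fits_in_row([200] * 10))
    # Fifty columns of 200 bytes each: too large, so some values would go off-row.
    print(estimate_row_bytes([200] * 50), fits_in_row([200] * 50))
```

Under these assumptions, the per-column cost of a short value is its data plus a 2-byte offset, which is the same bookkeeping any variable-length column pays. Values only start moving off-row once the whole row outgrows the limit.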

So all in all, I think you get the best of both worlds - inline storage when possible, overflow storage when necessary.

Marc

PS: As Mitch points out, this default behaviour can be turned off (via the "large value types out of row" table option, set with sp_tableoption) - I don't see any compelling reason to do so, however.
