SSIS与DTS的性能 [英] SSIS vs. DTS performance

查看:146
本文介绍了SSIS与DTS的性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

似乎在这么晚的时候这样做很疯狂,但是...

Seems crazy to be doing this at this late date, but...

我正在使用Rocket Software UniVerse源和SQL目标重建一些ETL基础结构.旧的目标平台是Windows Server 2003上的SQL 2000,新的平台是Windows Server 2012上的SQL2012.在两种情况下,都使用ODBC驱动程序连接到源.一切似乎都可以在新平台上正常运行,但包的执行时间却成倍降低.例如,使用SQL 2000/DTS大约有130万行和28列的一张表大约需要一个小时,而使用SQL 2012/SSIS大约需要3.5个小时.这两台SQL服务器都在Xen Server上进行了虚拟化,2012服务器具有更多的RAM和更多的vCPU,这两种计算机在磁盘基础结构方面均没有优势.程序包执行期间,2012年服务器上没有指标(内存,磁盘IO等)出现红线(甚至实际上即将结束).

I am rebuilding some ETL infrastructure with a Rocket Software UniVerse source and an SQL destination. The old destination platform was SQL 2000 on Windows Server 2003, the new platform is SQL 2012 on Windows Server 2012. In both cases, an ODBC driver is used to connect to the source. Everything seems to work fine on the new platform, but the execution time for a package is exponentially slower. For example, one table with roughly 1.3 Million rows and 28 Columns takes about an hour using SQL 2000/DTS and over 3.5 hours using SQL 2012/SSIS. Both SQL servers are virtualized on Xen Server, the 2012 server has more RAM and more vCPUs, neither machine has an advantage in disk infrastructure. No metrics (Memory, disk IO, etc.) are red-lining (or really even coming close) on the 2012 server during package execution.

我已经阅读了多个描述相同场景的论坛帖子,但是似乎没有一个真正适合我的解决方案.由于所有这些帖子都过时了(大多数从DTS到SSIS的转换都发生在SQL 2005的日子里),所以我很好奇是否有任何新鲜的信息.

I have read several forum posts describing the same scenario, but none really seemed to have a solution that works for me. Since all of these posts were quite dated (most of these conversions from DTS to SSIS happened in the SQL 2005 days), I was curious if there was any fresher info out there.

这些包是非常简单的表副本,没有任何转换.我在源连接中使用"SELECT列,列,.. FROM源表",而在目的地中使用表或视图-快速加载".尽管我不确定,但放慢APPEARS会出现在等式的源头上.

The packages are very simple table copies, no transforms. I am using a "SELECT column, column,.. FROM sourcetable" for my source connection and 'Table or View - Fast Load' for my destination. The slow down APPEARS to be on the source side of the equation, though I can't be certain.

任何帮助表示赞赏.

推荐答案

要研究的一个选项是减小数据流中的缓冲区大小.默认情况下,它设置为1万行.如果您的数据源速度较慢,则填充数据的存储桶"可能需要花费相当长的时间,只是为了开始将一批信息发送到目的地.尽管这似乎违反直觉,但降低该数字可以提高性能,因为5k或1k或100行数据可以更快地填充存储桶.然后,数据在数据流中被重新整理,并在填充存储桶2、3等时降落在源中.

One option to investigate is lowering the buffer size in your data flow. By default, it's set at 10k rows. If you have a slow data source, it can take quite a while to fill up the "bucket" of data just to start sending a batch of information down to the destination. While it might seem counterintuitive, lowering that number can increase performance as 5k, or 1k or 100 rows of data fill the bucket much sooner. That data then gets shuffled through the data flow and lands in the source while bucket 2, 3, etc are being filled.

如果您有SQL Server源,则可以通过提示您想要快速的N行来优化查询,该行将与SSIS包的行大小对齐.

If you have a SQL Server source, you can optimize your query by hinting that you'd like a fast N rows, which you'd align with your SSIS package's row size.

请参见罗伯·法利(Rob Farley)的文章,以获取有关此内容的更多详细信息.

See Rob Farley's article for more details about that.

这篇关于SSIS与DTS的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆