SSIS 存储过程使用临时表 2008 和 2014 [英] SSIS Stored Procedure uses Temp Table 2008 and 2014

查看:28
本文介绍了SSIS 存储过程使用临时表 2008 和 2014的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正在编写一个 SSIS 包,该包通过 OLE DB 源从存储过程中检索数据.存储过程包含一个相当讨厌的查询,我已经能够通过使用临时表来改进它.如果我将这些临时表切换到表变量,逻辑读取从大约 130 万跳到大约 5600 万.我对 130 万次感到不舒服,但我无法对 5600 万次逻辑读取感到满意.因此,我无法真正将临时表转换为表变量.

I'm currently writing an SSIS package that retrieves data from a stored procedure via an OLE DB Source. The stored procedure contains a rather nasty query that I've been able to improve with the use of temp tables. If I switch these temp tables to table variables, the logical reads jump from about 1.3 million to about 56 million. I'm uncomfortable enough with the 1.3 million, but there is no way that I can be satisfied with the 56 million logical reads. Thus, I can't really convert the temp tables to table variables.

但是,SSIS(或 SQL Server)无法解析此查询的元数据,因此程序包将无法运行.我在网上找到了一些不同的解决方案,但它们似乎都不适用于 SQL Server 2008 和 SQL Server 2014.我们目前正在将所有服务器升级到 2014,这个特定的包在 2008 中运行DEV,2014 年在 QA,2008 年在生产中.到秋季,PROD 层将是 2014 年,而 DEV 层将在此之后的某个时间升级.不幸的是,我等不及这些升级发生时发布这个 SSIS 包.数据需要在下周开始移动.因此,我需要找出一种方法来为两种环境解析元数据.这是我迄今为止尝试过的:

However, SSIS (or rather SQL Server) can't resolve the metadata for this query, so the package won't run. I've found a few different solutions online, but none of them seem to work for both SQL Server 2008 and SQL Server 2014. We are currently in the process of upgrading all of our servers to 2014, and this particular package runs against 2008 in DEV, 2014 in QA, and 2008 in production currently. By the fall, the PROD tier will be 2014, and the DEV tier will be promoted sometime after that. Unfortunately, I can't wait until these upgrades happen to release this SSIS package. The data needs to start moving by next week. Thus, I need to figure out a way to get the metadata resolved for both environments. Here's what I've tried so far:

  1. 在返回正确元数据的 IF 1=0 块中添加一个虚拟选择.这在 2008 年有效,但在 2014 年无效.

  1. Add a dummy select in an IF 1=0 block which returns the proper metadata. This works in 2008, but not 2014.

在存储过程的开头使用SET FMTONLY OFF.这在 2008 年有效,但在 2014 年无效.此外,它会导致存储过程为返回的每一列(在本例中超过 30 个)运行一次,即使它确实有效,这也是一个交易破坏者.

Use SET FMTONLY OFF at the beginning of the stored procedure. This works in 2008, but not 2014. Furthermore, it causes the stored procedure to run once for each column returned (over 30 in this case), which is a deal-breaker even if it did work.

使用 EXEC ... WITH RESULT SETS (( ... ));.这在 2014 年有效,但在 2008 年无效.

Use EXEC ... WITH RESULT SETS (( ... ));. This works in 2014, but not in 2008.

部署返回正确元数据的存储过程,构建并部署 SSIS 包,然后将存储过程修改为正确的版本.这在任何一种环境中似乎都不起作用,而且这会使在我们的 ETL 框架内开发的任何其他 ETL 应用程序复杂化.

Deploy a stored procedure which returns the proper metadata, build and deploy the SSIS package, then modify the stored procedure to the proper version. This hasn't seemed to work in either environment, and this would complicate of any other ETL applications developed within our ETL framework.

如果我什么都不知道,我可以将不同的存储过程和包部署到不同的层,但我更倾向于反对这种做法.一方面,这会使未来的版本复杂化,而且我还需要确保在升级服务器后不会忘记更新存储过程和包.

If I can't figure anything out, I could either deploy different stored procedures and packages to the different tiers, but I would very much prefer against this. For one, this would complicate future releases, and I would also need to ensure that I don't forget about updating the stored procedure and package once we upgrade the servers.

我还可以在数据库中制作真正的表来代替这些临时表.我不太喜欢这个解决方案,但这是我可以容忍的.如果我最终这样做,我将来可能会改用 WITH RESULT SETS.

I could also make real tables in the database which would take the place of these temp tables. I don't really like this solution, but it's something that I could tolerate. If I end up doing this, I would probably switch to using the WITH RESULT SETS in the future.

但是,我个人对这两种解决方案都不太关心,所以我想知道是否有任何我错过的解决方法可能会更好.

However, I personally don't care much for either of these solutions, so I was wondering if there is any workaround that I missed that might work a bit better.

推荐答案

尽管您很不情愿,但我认为您做出了正确的选择,并且专用的集结区是正确的选择.我使用过的大多数生产 ETL 都有一个专用的临时数据库,更不用说表了.然后,您可以更明确地控制存储,这使得性能更可靠,并且整个事情通常更易于维护.例如,您可以为这些具有自己的文件组等的表创建一个专用的连续快速磁盘空间块.我当然宁愿看到 2 个独立的 SP 依赖于几个物理表,而不是一个真正粗糙的单个.

Despite your reluctance, I think you've made the right choice and a dedicated staging area is the right way to go. Most of the production ETLs I've worked with have a dedicated staging database, never mind tables. You then have the benefit of being able to control the storage more explicitly, which makes performance more reliable and the whole thing generally more maintainable. For example, you can create a dedicated contiguous block of fast disk space for these tables with their own file group etc. I'd certainly rather see 2 separate SPs relying on a few physical tables than a really gnarly single one.

也就是说,在不知道任何细节的情况下,这只是我的经验,因此对未来读者的警告:与所有数据库一样,请务必衡量场景的实际性能(之前和之后),而不是基于任何假设关于查询计划 - 它可能会误导您.

That said, without knowing any specifics this is just my experience, so a caveat for future readers: As with all things database, be sure to measure the actual performance of your scenario (before and after) rather than making any assumptions based on the query plan - it might be misleading you.

这篇关于SSIS 存储过程使用临时表 2008 和 2014的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆