SSIS 多表加载 [英] SSIS multiple table loads

查看:44
本文介绍了SSIS 多表加载的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是第一次学习 SSIS.我正在尝试将多个表从我的临时数据库加载到数据仓库中的多个表.我的临时数据库只不过是一个数据库,其中所有源数据库中的所有表都复制到其中.我发现每个数据流任务只允许将一个源表连接到一个目标表.我是否必须使用与源数据库中的表数量一样多的数据流任务?有没有更简单有效的方法来做到这一点?

I am learning SSIS for the first time. I am trying to load multiple tables from my staging database to multiple tables in the data warehouse. My staging database is nothing but a database with all the tables from all my source databases copied to it. I found that each data flow task only allows to connect one source table to one destination table. Do I have to use as many data flow tasks as the number of tables I have in my source database? Is there a simple and more efficient way to do this?

根据我的理解,我需要一个在 SQL Server 中创建的带有空 Dim 和 Fact 表的空数据库,作为目标数据仓库并从临时数据库接收数据.因此我已经创建了这个空的数据仓库.请帮忙.

From my understanding, I would need an empty database with empty Dim and Fact tables created in SQL Server which would act as destination data warehouse and receive the data from the staging database. Therefore I have already created this empty data warehouse. Please help.

推荐答案

没有约定要求在 DFT(数据流任务)中只需要 1 个源和 1 个目标.

There is no convention that you need to have only 1 source and 1 destination in a DFT (Data Flow Task).

您可以在 DFT 中拥有多个源组件和目标组件.但是,如果单个 DFT 中有太多源到目标组件,则可能会耗尽内存.在预执行阶段,SSIS 必须决定为各个 S2D 组件分配内存,如果内存消耗过高,则服务器可能无法为所有组件分配足够的内存.这可能会导致长时间等待.

You can have multiple source and destination components inside a DFT. However, you may run out of memory if you have too many source to destination components in a single DFT. In the pre-execute phase SSIS has to decide the memory to allocate for individual S2D components, and if the memory consumption is too high then server may not able to grant enough memory for all the components. This may result in long waiting.

然而,DFT 有一个名为 EngineThreads 的属性,它定义了数据流引擎可以创建和并行运行的线程数.根据 MSDN:

However there is a property called EngineThreads for DFT which defines how many threads the data flow engine can create and run in parallel. According to MSDN:

要了解此属性的工作原理,请考虑包含三个的示例包数据流任务.每个数据流任务包含十个源到目的地执行树.如果在每个数据流任务上将 EngineThreads 设置为 10,所有 30 个执行树都可能同时运行. 然而,一般规则是并行运行的线程数不要超过可用线程数处理器.运行的线程数超过可用处理器的数量由于线程之间频繁的上下文切换,可能会影响性能.

To understand how this property works, consider the sample package with three Data Flow tasks. Each of Data Flow task contains ten source-to-destination execution trees. If you set EngineThreads to 10 on each Data Flow task, all 30 execution trees can potentially run simultaneously.However, the general rule is not to run more threads in parallel than the number of available processors. Running more threads than the number of available processors can hinder performance because of the frequent context-switching between threads.

因此最好创建单独的数据流任务并执行操作.

So it's better to create individual Data Flow Task and perform the operation.

注意:如果您从多个来源获取数据,那么您可以在单个数据流任务中完成.

Note: If you are getting data from multiple sources, then you can do it in a single data flow task.

这篇关于SSIS 多表加载的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆