SSIS pkg with flat-file connection with fewer columns will fail


Question

Assume a flat file F1.txt with column MyCol1, and a package Pkg1 that loads said file into SQL Server.

No problem so far, right? Right.

Now assume a flat file F2.txt with columns MyCol1 and MyCol2, and the same package Pkg1 to load said file into SQL Server.

We'll make a few adjustments to Pkg1 and presto, it loads F2.txt like a dream.

Now we feed it F1.txt, and that's where things deteriorate.

BTW, this is not confined to flat files; it is a more general problem.

Any and all suggestions on how to run legacy data through the same package are welcome.

TIA

Peter

Recommended answer

It reads like you have two problems here. The first is understanding how to use Connection Managers. For flat file inputs, you are generally going to be better served by creating one connection manager per file layout. File 1 looks like (Column1) and File 2 looks like (Column1, Column2)? That means 2 different Flat File Connection Managers need to be defined.

If you have 2 versions of File 2, one where Column1 has numbers and another where Column1 contains character data, those would require 2 unique connection managers (3 in total).

The good news relative to the above is that file name changes are trivial and do not require a unique Connection Manager to be created. F1.txt, F1_20120501.txt, F1.good.txt, etc. would all be served by the Connection Manager you have defined for that layout. You would simply need to use an expression on the ConnectionString property of the given Connection Manager so that it points at the current file at run time.

So now that you have all these Flat File Connection Managers, you need to use them. That magic happens in the Data Flow Tasks. A data flow is real persnickety about the metadata used in it. When you design a data flow, you are making a contract with SSIS, and if you try to violate it by making a character field into a date field or by not providing all the columns, the package will fail validation checks because you aren't holding up your end of the bargain. The resolution is that you're again going to need to define multiple data flows, one around each of the Connection Managers your package needs.

With all that defined, you would simply need a coordinator that looks at the source files to determine which data flow should be executed. I provided an example on this question: "Create SSIS package to import from one of many data sources".
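To make the coordinator idea concrete, here is a minimal sketch of the decision logic in Python rather than in SSIS itself (inside a package this would more likely live in a Script Task that sets a variable used by precedence constraints). It assumes each file starts with a header row; the data flow names `DataFlow_F1` and `DataFlow_F2` and the comma delimiter are hypothetical, chosen only for illustration.

```python
import csv
import io

# Hypothetical mapping from a file layout (its header columns) to the
# name of the data flow built for that layout.
LAYOUTS = {
    ("MyCol1",): "DataFlow_F1",
    ("MyCol1", "MyCol2"): "DataFlow_F2",
}


def pick_data_flow(stream, delimiter=","):
    """Return the name of the data flow whose layout matches the header row.

    Assumes the first line of the stream is a header row.
    """
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader, None)
    if header is None:
        raise ValueError("file is empty")
    key = tuple(name.strip() for name in header)
    if key not in LAYOUTS:
        raise ValueError(f"unrecognized layout: {key}")
    return LAYOUTS[key]


if __name__ == "__main__":
    f1 = io.StringIO("MyCol1\n1\n2\n")
    f2 = io.StringIO("MyCol1,MyCol2\n1,a\n2,b\n")
    print(pick_data_flow(f1))
    print(pick_data_flow(f2))
```

An unrecognized layout raises an error instead of falling back to a default, which mirrors the way SSIS refuses to run a data flow whose metadata does not match.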

There was also a similar question where I proposed a solution that may be of interest: "SSIS Task for inconsistent column count import?". It really depends on what your rules are for processing.

If you are trying to consolidate or reuse business logic in your SSIS packages, then I would look at an approach of using the various data flows to stage the discrete sources into a single data store (a raw file, a staging table with lots of nullable columns, etc.).
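The staging idea can be sketched outside SSIS as follows. This is an illustration in Python, not package code: it assumes header rows and a comma delimiter, and `STAGING_COLUMNS` is a hypothetical superset of the columns in every layout. Rows from the narrower file are padded with `None`, which plays the role of NULL in a staging table.

```python
import csv
import io

# Hypothetical staging schema: the union of the columns of every layout.
STAGING_COLUMNS = ("MyCol1", "MyCol2")


def stage_rows(stream, delimiter=","):
    """Yield each row reshaped to the staging schema.

    Columns that the source file does not have come out as None.
    """
    reader = csv.DictReader(stream, delimiter=delimiter)
    for row in reader:
        yield {col: row.get(col) for col in STAGING_COLUMNS}


if __name__ == "__main__":
    f1 = io.StringIO("MyCol1\na\nb\n")
    f2 = io.StringIO("MyCol1,MyCol2\nc,d\n")
    for staged in list(stage_rows(f1)) + list(stage_rows(f2)):
        print(staged)
```

Once every source lands in the same shape, the downstream business logic only has to be written once, against the staging schema.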
