SSIS Task for inconsistent column count import?


Question

I regularly receive feed files from different suppliers. Although the column names are consistent, the problem comes when some suppliers send text files with more or fewer columns in their feed file.

Furthermore, the arrangement of the columns in these files is inconsistent.

Other than the Dynamic Data Flow Task provided by CozyRoc, is there another way I could import these files? I am not a C# guru, but I am leaning towards using a "Script Task" in the control flow or a "Script Component" in a data flow task.

Any suggestions, samples or direction will be greatly appreciated.

http://www.cozyroc.com/ssis/data-flow-task

Some forum threads:

http://www.sqlservercentral.com/Forums/Topic525799-148-1.aspx#bm526400

http://www.bidn.com/forums/microsoft-business-intelligence/integration-services/26/dynamic-data-flow

Recommended answer

Off the top of my head, I have a 50% solution for you.

SSIS really cares about metadata, so variations in it tend to result in exceptions. DTS was far more forgiving in this sense. That strong need for consistent metadata makes the Flat File Source troublesome to use here.

If the problem is the component, let's not use it. What I like about this approach is that conceptually, it's the same as querying a table: the order of the columns does not matter, nor does the presence of extra columns.

I created 3 variables, all of type String: CurrentFileName, InputFolder and Query.


  • InputFolder is hard-wired to the source folder. In my example, it's C:\ssisdata\Kipreal
  • CurrentFileName is the name of a file. At design time it was input5columns.csv, but that will change at run time.
  • Query is an expression: "SELECT col1, col2, col3, col4, col5 FROM " + @[User::CurrentFileName]

Set up a connection to the input file using the JET OLEDB driver. After creating it as described in the linked article, I renamed it to FileOLEDB and set an expression on the connection manager's ConnectionString property: "Data Source=" + @[User::InputFolder] + ";Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties=\"text;HDR=Yes;FMT=CSVDelimited;\";"
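To make the two expressions concrete, here is a minimal Python sketch of the strings they should evaluate to at run time. It is only an illustration of the string concatenation, not SSIS code; the function names build_connection_string and build_query are hypothetical, and the folder and file name are the example values from above.

```python
def build_connection_string(input_folder: str) -> str:
    """Mirror the expression set on the FileOLEDB connection manager."""
    return (
        "Data Source=" + input_folder
        + ";Provider=Microsoft.Jet.OLEDB.4.0;"
        + 'Extended Properties="text;HDR=Yes;FMT=CSVDelimited;";'
    )


def build_query(current_file_name: str) -> str:
    """Mirror the expression on the Query variable."""
    return "SELECT col1, col2, col3, col4, col5 FROM " + current_file_name


if __name__ == "__main__":
    # Example values taken from the answer; adjust to your own folder and file.
    print(build_connection_string(r"C:\ssisdata\Kipreal"))
    print(build_query("input5columns.csv"))
```

Note that the data source is the folder, while the file name plays the role of the table name in the query.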

My Control Flow looks like a Data Flow Task nested in a Foreach Loop container that uses the Foreach File enumerator.

My Foreach File enumerator is configured to operate on files. I put an expression on the Directory property: @[User::InputFolder]. Notice that at this point, if the value of that folder needs to change, it will be updated correctly in both the connection manager and the file enumerator. Under "Retrieve file name", instead of the default "Fully qualified", choose "Name and extension".

In the Variable Mappings tab, assign the value to our @[User::CurrentFileName] variable.

At this point, each iteration of the loop will change the value of @[User::Query] to reflect the current file name.
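The effect of the loop can be sketched outside SSIS in a few lines of Python. This is only an analogy for what the Foreach File enumerator and the Query expression do together; the function name queries_for_folder and the "*.csv" file mask are assumptions for illustration, since the answer does not state which mask was used.

```python
from pathlib import Path

COLUMNS = ["col1", "col2", "col3", "col4", "col5"]


def queries_for_folder(input_folder, pattern="*.csv"):
    """Return one (file name, query) pair per matching file in the folder."""
    results = []
    for path in sorted(Path(input_folder).glob(pattern)):
        if not path.is_file():
            continue
        # "Name and extension": keep only the file name, not the full path.
        current_file_name = path.name
        # The query is re-evaluated for every file, like the SSIS expression.
        query = "SELECT " + ", ".join(COLUMNS) + " FROM " + current_file_name
        results.append((current_file_name, query))
    return results
```

Retrieving only the name and extension matters because the folder is already supplied by the connection string.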

This is actually the easiest piece. Use an OLE DB Source and wire it as indicated.

Use the FileOLEDB connection manager and change the Data access mode to "SQL command from variable". Select the @[User::Query] variable there, click OK, and you're ready to work.

I created two sample files, input5columns.csv and input7columns.csv. All of the columns in the 5-column file are also in the 7-column file, but in a different order (col2 is at zero-based ordinal position 2 in the first and 6 in the second). I negated all the values in the 7-column file to make it readily apparent which file is being operated on.

col1,col3,col2,col5,col4
1,3,2,5,4
1111,3333,2222,5555,4444
11,33,22,55,44
111,333,222,555,444

col1,col3,col7,col5,col4,col6,col2
-1111,-3333,-7777,-5555,-4444,-6666,-2222
-111,-333,-777,-555,-444,-666,-222
-1,-3,-7,-5,-4,-6,-2
-11,-33,-77,-55,-44,-666,-222
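The same "select by name, ignore order and extras" idea can be checked against the two sample files with a short Python sketch. It uses the standard csv module rather than the JET driver, so it only illustrates the concept; the helper name select_columns is hypothetical.

```python
import csv
import io

WANTED = ["col1", "col2", "col3", "col4", "col5"]

FILE_5 = """col1,col3,col2,col5,col4
1,3,2,5,4
1111,3333,2222,5555,4444
11,33,22,55,44
111,333,222,555,444
"""

FILE_7 = """col1,col3,col7,col5,col4,col6,col2
-1111,-3333,-7777,-5555,-4444,-6666,-2222
-111,-333,-777,-555,-444,-666,-222
-1,-3,-7,-5,-4,-6,-2
-11,-33,-77,-55,-44,-666,-222
"""


def select_columns(text, wanted=WANTED):
    """Pick the wanted columns by header name, whatever their position."""
    reader = csv.DictReader(io.StringIO(text))
    return [[row[name] for name in wanted] for row in reader]


if __name__ == "__main__":
    for label, text in (("5 columns", FILE_5), ("7 columns", FILE_7)):
        print(label)
        for values in select_columns(text):
            print("  ", values)
```

Both files yield rows in col1 to col5 order, and col6 and col7 never appear in the output.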

Running the package results in these two screenshots.

I don't know of a way to tell the query-based approach that it's OK if a column doesn't exist. If there's a unique key, I suppose you could define your query to include only the columns that must be there, then perform lookups against the file to try to obtain the columns that ought to be there, without failing the lookup if a column doesn't exist. Pretty kludgey, though.
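For comparison, the missing-column case is simple to express once the header is read in code, which is roughly the logic a Script Component would have to implement. The Python sketch below is not SSIS code and not part of the approach above; select_with_optional and the required/optional split are assumptions made for illustration.

```python
import csv
import io


def select_with_optional(text, required, optional, default=None):
    """Read rows by header name; optional columns fall back to a default."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in required if name not in header]
    if missing:
        # Required columns must exist; fail loudly instead of guessing.
        raise ValueError("missing required columns: " + ", ".join(missing))
    rows = []
    for row in reader:
        # dict.get returns the default when an optional column is absent.
        rows.append({name: row.get(name, default) for name in required + optional})
    return rows


if __name__ == "__main__":
    sample = "col1,col3,col2\n1,3,2\n"
    print(select_with_optional(sample, ["col1", "col2"], ["col3", "col9"]))
```

Here col9 is absent from the sample, so it comes back as the default value instead of raising an error.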
