How to import text files with the same name and schema but different directories into a database?


Problem description

I need to import multiple txt files with the same names and the same schema into a single table in a SQL Server 2008 database. The problem is that they are all in different directories:

TEST
     201304
            sample1.txt
            sample2.txt
     201305
            sample1.txt
            sample2.txt
     201306
            sample1.txt
            sample2.txt

Is there any way in SSIS that I can set this up?

Solution

Yes. You will want to use a Foreach Loop Container with the Foreach File Enumerator and then check the Traverse subfolders option.

Edit

Apparently my answer wasn't cromulent enough, so please accept this working code which illustrates what my brief original answer stated.

Source data

I created 3 folders as described above to contain the files sample1.txt and sample2.txt.

C:\>MKDIR SSISDATA\SO\TEST\201304
C:\>MKDIR SSISDATA\SO\TEST\201305
C:\>MKDIR SSISDATA\SO\TEST\201306

The contents of the file are below. In each folder's copy of a file, the ID value is incremented and the text value is altered, to prove that the package has picked up the new file.

ID,value
1,ABC

Package generation

This part assumes you have BIDS Helper installed. It is not required for the solution; it simply provides a common framework that future readers can use to reproduce it.

I created a BIML file with the following content. Even though I have the table create step in there, I needed to have that run on the target server prior to generating the package.

<Biml xmlns="http://schemas.varigence.com/biml.xsd">
    <!-- Create a basic flat file source definition -->
    <FileFormats>
        <FlatFileFormat
            Name="FFFSrc"
            CodePage="1252"
            RowDelimiter="CRLF"
            IsUnicode="false"
            FlatFileType="Delimited"
            ColumnNamesInFirstDataRow="true"
        >
            <Columns>
                <Column
                    Name="ID"
                    DataType="Int32"
                    Delimiter=","
                    ColumnType="Delimited"
                />
                <Column
                    Name="value"
                    DataType="AnsiString"
                    Delimiter="CRLF"
                    InputLength="20"
                    MaximumWidth="20"
                    Length="20"
                    CodePage="1252"
                    ColumnType="Delimited"
                    />
            </Columns>
        </FlatFileFormat>
    </FileFormats>

    <!-- Create a connection that uses the flat file format defined above-->
    <Connections>
        <FlatFileConnection
            Name="FFSrc"
            FileFormat="FFFSrc"
            FilePath="C:\ssisdata\so\TEST\201306\sample1.txt"
            DelayValidation="true"
        />
        <OleDbConnection
            Name="tempdb"
            ConnectionString="Data Source=localhost\dev2012;Initial Catalog=tempdb;Provider=SQLNCLI11.1;Integrated Security=SSPI;Auto Translate=False;"
        />

    </Connections>

    <!-- Create a package to illustrate how to apply an expression on the Connection Manager -->
    <Packages>
        <Package
            Name="so_19957451"
            ConstraintMode="Linear"
        >
            <Connections>
                <Connection ConnectionName="tempdb"/>
                <Connection ConnectionName="FFSrc">
                    <Expressions>
                        <!-- Assign a variable to the ConnectionString property. 
                        The syntax for this is ConnectionManagerName.Property -->
                        <Expression PropertyName="FFSrc.ConnectionString">@[User::CurrentFileName]</Expression>
                    </Expressions>
                </Connection>
            </Connections>

            <!-- Create the variables used by the package; CurrentFileName points to the current file -->
            <Variables>
                <Variable Name="CurrentFileName" DataType="String">C:\ssisdata\so\TEST\201306\sample1.txt</Variable>
                <Variable Name="FileMask" DataType="String">*.txt</Variable>
                <Variable Name="SourceFolder" DataType="String">C:\ssisdata\so\TEST</Variable>
                <Variable Name="RowCountInput" DataType="Int32">0</Variable>
                <Variable Name="TargetTable" DataType="String">[dbo].[so_19957451]</Variable>
            </Variables>

            <!-- Add a foreach file enumerator that uses the variables above -->
            <Tasks>
                <ExecuteSQL 
                    Name="SQL Create Table"
                    ConnectionName="tempdb">
                    <DirectInput>
                        IF NOT EXISTS (SELECT * FROM sys.tables T WHERE T.name = 'so_19957451' and T.schema_id = schema_id('dbo'))
                        BEGIN
                            CREATE TABLE dbo.so_19957451(ID int NOT NULL, value varchar(20) NOT NULL);
                        END
                    </DirectInput>
                </ExecuteSQL>
                <ForEachFileLoop
                    Name="FELC Consume files"
                    FileSpecification="*.csv"
                    ProcessSubfolders="true"
                    RetrieveFileNameFormat="FullyQualified"
                    Folder="C:\"
                    ConstraintMode="Linear"
                >
                    <!-- Define the expressions to make the input folder and the file mask 
                    driven by variable values -->
                    <Expressions>
                        <Expression PropertyName="Directory">@[User::SourceFolder]</Expression>
                        <Expression PropertyName="FileSpec">@[User::FileMask]</Expression>
                    </Expressions>
                    <VariableMappings>
                        <!-- Notice that we use the convention of User.Variable name here -->
                        <VariableMapping
                            Name="0"
                            VariableName="User.CurrentFileName"
                        />
                    </VariableMappings>
                    <Tasks>
                        <Dataflow Name="DFT Import file" DelayValidation="true">
                            <Transformations>
                                <FlatFileSource Name="FFS Sample" ConnectionName="FFSrc"/>
                                <RowCount Name="RC Source" VariableName="User.RowCountInput"/>
                                <OleDbDestination 
                                    Name="OLE_DST"
                                    ConnectionName="tempdb">
                                    <TableFromVariableOutput VariableName="User.TargetTable"/>                                  
                                </OleDbDestination>
                            </Transformations>
                        </Dataflow>
                    </Tasks>
                </ForEachFileLoop>
            </Tasks>
        </Package>
    </Packages>
</Biml>

Right-click the BIML file and select Generate SSIS Package. At this point, you should have a package named so_19957451 added to your current SSIS project.
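Since the destination table has to exist on the target server before the package is generated, the statement from the SQL Create Table task can be run by hand first. The sketch below simply repeats that statement; it assumes the tempdb database named in the OleDbConnection above and a query window connected to that instance.

```sql
-- Assumption: run against the same instance/database the OleDbConnection points to
USE tempdb;

-- Create the target table only if it is not already there
IF NOT EXISTS (SELECT * FROM sys.tables T WHERE T.name = 'so_19957451' AND T.schema_id = schema_id('dbo'))
BEGIN
    CREATE TABLE dbo.so_19957451(ID int NOT NULL, value varchar(20) NOT NULL);
END
```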

Package configuration

There's no need for any configuration because it's already been done via BIML but moar screenshots make for better answers.

This is the basic package

Here are my variables

Configuration of the Foreach Loop, as called out in the MSDN article, along with my note to select the Traverse subfolders option

Assign the value generated per loop iteration to the variable User::CurrentFileName

The flat file connection manager FFSrc has an expression applied to its ConnectionString property to ensure it uses the variable @[User::CurrentFileName]. This changes the source file on each iteration of the loop.

Execution results

The results from the database match the output from the package execution:

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample1.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample2.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201304\sample2.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample1.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample1.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample2.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201305\sample2.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample1.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample1.txt" has ended.

Information: 0x402090DC at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample2.txt" has started.

Information: 0x402090DD at DFT Import file, FFS Sample [2]: The processing of file "C:\ssisdata\so\TEST\201306\sample2.txt" has ended.
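To check the load on the database side, a query along these lines could be run against the target table. It is only a sketch: it assumes the table and column names from the BIML above (dbo.so_19957451 in tempdb, with columns ID and value). If each of the six files holds a single data row, as in the sample contents, six rows would be expected.

```sql
-- Assumption: the package loaded into tempdb.dbo.so_19957451 as defined in the BIML
USE tempdb;

-- List what was loaded, one row per source record
SELECT T.ID, T.value
FROM dbo.so_19957451 AS T
ORDER BY T.ID;

-- Compare the total against the number of data rows across all source files
SELECT COUNT(*) AS RowsLoaded
FROM dbo.so_19957451;
```

Because the package appends on every run, re-running it without clearing the table first will add the same rows again.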
