将多个数据从excel加载到sql SSIS [英] load multiple data from excel to sql SSIS

查看:118
本文介绍了将多个数据从excel加载到sql SSIS的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用SSIS,我需要使用SSIS将具有以下(Yellos)格式的多个文件加载到SQL

I am working with SSIS and I need to load multiple files with the following (Yellos) format to SQL using SSIS

您所看到的问题是,如果填充了列A(例如:忽略行#14),则文件具有仅处理/使用记录的可怕格式。 -X),我需要将D1中的值插入日期列。

The problem as you can see is that the files has an horrible format only process / consume records if the column A is populated (e.g: ignoring rows# 14 - X ) and I need to insert the value in D1 into the Date column.

有什么建议吗?

推荐答案

让我们将此问题分为3个子问题:


  1. D1
  2. 获取日期值
  3. 从第4行开始读取

  4. 忽略Column1为NULL的所有行

  1. Get the date value from D1
  2. Start Reading from Row number 4
  3. Ignore all Rows where Column1 is NULL



解决方案



1。从D1获取日期值

Solution

1. Get the date value from D1


  1. 创建2个SSIS变量, @ [用户:: FilePath] (类型为字符串),其中包含excel文件路径, @ [User :: FileDate] (字符串类型),我们将使用它来存储日期值

  2. 添加脚本任务,选择脚本语言为 Visual Basic em>

  3. 选择 @ [User :: FilePath] 作为 ReadOnly变量 @ [User :: FileDate] 作为 ReadWrite变量

  4. 打开脚本编辑器,然后使用以下代码检索日期值并将其存储到 @ [User :: FileDate]

  1. Create 2 SSIS variables, @[User::FilePath] (of type string) that contains the excel file path, @[User::FileDate] (of type string) that we will use it to store the date value
  2. Add a script Task, choose the script language as Visual Basic
  3. Select @[User::FilePath] as a ReadOnly variable and @[User::FileDate] as a ReadWrite variable
  4. Open the Script Editor and use the following code to retrieve the Date Value and store it into @[User::FileDate]

这将搜索名为退款的工作表,并从中提取日期值,并将该值存储到 @ [User :: FileDate]

This will search for the sheet named Refunds and extract the date value from it and store this value into @[User::FileDate]

    m_strExcelPath = Dts.Variables.Item("FilePath").Value.ToString

    Dim strSheetname As String = String.Empty
    Dim strDate as String = String.Empty

    m_strExcelConnectionString = Me.BuildConnectionString()

    Try


        Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)

            If OleDBCon.State <> ConnectionState.Open Then
                OleDBCon.Open()
            End If

            'Get all WorkSheets
            m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
                                                               New Object() {Nothing, Nothing, Nothing, "TABLE"})

            'Loop over work sheet to get the first one (the excel may contains temporary sheets or deleted ones

            For Each schRow As DataRow In m_dtschemaTable.Rows
                strSheetname = schRow("TABLE_NAME").ToString

                If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then

                If Not strSheetname.Tolower.Contains("refunds") Then Continue For

                    Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "A1:D1]", OleDBCon)

                        Dim dtTable As New DataTable("Table1")


                        cmd.CommandType = CommandType.Text

                        Using daGetDataFromSheet As New OleDbDataAdapter(cmd)

                            daGetDataFromSheet.Fill(dtTable)

                            'Get Value from column 4 (3 because it is a zero-based index
                            strDate = dtTable.Rows(0).Item(3).ToString

                        End Using

                    End Using

                    'when the first correct sheet is found there is no need to check others
                    Exit For

                End If
            Next

            OleDBCon.Close()

        End Using

    Catch ex As Exception
        Throw New Exception(ex.Message, ex)
    End Tr

    Dts.Variables.Item("FileDate").Value = strDate

    Dts.TaskResult = ScriptResults.Success
End Sub




  1. 在DataFlow任务中添加派生列转换,添加具有以下表达式的派生列

  1. In the DataFlow Task add a Derived Column Transformation, add a derived column with the following expression

@[User::FileDate]


2。从第4行开始读取

2. Start Reading from Row Number 4

我们假设Excel文件路径存储在 @中[User :: FilePath]


  1. 首先打开 Excel Connection管理器,并取消选中框第一行具有列名

  2. 在数据流任务中,双击excel source

  3. 将源设置为 SQL命令

  4. 使用以下命令: SELECT * FROM [Refunds $ A4:D] ,因此它将从第4行开始读取

  5. 列名将如下所示F1 ... F4,在excel源代码中,您可以转到Columns选项卡并为列名指定别名,因此在数据流任务中,它们将以其别名显示

  1. First open the Excel Connection Manager and uncheck the box First row has column names
  2. In the DataFlow Task, double click on the excel source
  3. Set the source to SQL Command
  4. Use the following command: SELECT * FROM [Refunds$A4:D] , so it will start reading from the row number 4
  5. Columns names will be as the following F1 ... F4 , in the excel source you can go to the Columns Tab and give alias to the columns names, so in the data flow task they will be showed with their aliases

3。忽略Column1为NULL的所有行

3. Ignore all Rows Where Column1 is NULL


  1. 在Excel Source之后添加条件拆分

  2. 基于以下表达式分割流

  1. Add a conditional split after the Excel Source
  2. Split the Flow based on the following expression

ISNULL([F1]) == False


如果您不给 F1 的别名,否则使用别名

If you didn't give an alias to F1 otherwise use the alias

最后,请记住,您必须添加一个派生列(如我们在第一个子问题中所述),其中包含日期值

Finally, remember that you must add a derived column (as we said in the first sub-problem) that contains the date value

这篇关于将多个数据从excel加载到sql SSIS的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆