将多个数据从excel加载到sql SSIS [英] load multiple data from excel to sql SSIS
问题描述
我正在使用SSIS,我需要使用SSIS将具有以下(Yellos)格式的多个文件加载到SQL
I am working with SSIS and I need to load multiple files with the following (Yellos) format to SQL using SSIS
您所看到的问题是,如果填充了列A(例如:忽略行#14),则文件具有仅处理/使用记录的可怕格式。 -X),我需要将D1中的值插入日期列。
The problem as you can see is that the files has an horrible format only process / consume records if the column A is populated (e.g: ignoring rows# 14 - X ) and I need to insert the value in D1 into the Date column.
有什么建议吗?
推荐答案
让我们将此问题分为3个子问题:
- 从
D1
获取日期值 - 从第4行开始读取
- 忽略Column1为NULL的所有行
- Get the date value from
D1
- Start Reading from Row number 4
- Ignore all Rows where Column1 is NULL
解决方案
1。从D1获取日期值
Solution
1. Get the date value from D1
- 创建2个SSIS变量,
@ [用户:: FilePath]
(类型为字符串),其中包含excel文件路径,@ [User :: FileDate]
(字符串类型),我们将使用它来存储日期值 - 添加脚本任务,选择脚本语言为 Visual Basic em>
- 选择
@ [User :: FilePath]
作为 ReadOnly变量和@ [User :: FileDate]
作为 ReadWrite变量 - 打开脚本编辑器,然后使用以下代码检索日期值并将其存储到
@ [User :: FileDate]
- Create 2 SSIS variables,
@[User::FilePath]
(of type string) that contains the excel file path,@[User::FileDate]
(of type string) that we will use it to store the date value - Add a script Task, choose the script language as Visual Basic
- Select
@[User::FilePath]
as a ReadOnly variable and@[User::FileDate]
as a ReadWrite variable - Open the Script Editor and use the following code to retrieve the Date Value and store it into
@[User::FileDate]
这将搜索名为退款
的工作表,并从中提取日期值,并将该值存储到 @ [User :: FileDate]
This will search for the sheet named Refunds
and extract the date value from it and store this value into @[User::FileDate]
m_strExcelPath = Dts.Variables.Item("FilePath").Value.ToString
Dim strSheetname As String = String.Empty
Dim strDate as String = String.Empty
m_strExcelConnectionString = Me.BuildConnectionString()
Try
Using OleDBCon As New OleDbConnection(m_strExcelConnectionString)
If OleDBCon.State <> ConnectionState.Open Then
OleDBCon.Open()
End If
'Get all WorkSheets
m_dtschemaTable = OleDBCon.GetOleDbSchemaTable(OleDbSchemaGuid.Tables,
New Object() {Nothing, Nothing, Nothing, "TABLE"})
'Loop over work sheet to get the first one (the excel may contains temporary sheets or deleted ones
For Each schRow As DataRow In m_dtschemaTable.Rows
strSheetname = schRow("TABLE_NAME").ToString
If Not strSheetname.EndsWith("_") AndAlso strSheetname.EndsWith("$") Then
If Not strSheetname.Tolower.Contains("refunds") Then Continue For
Using cmd As New OleDbCommand("SELECT * FROM [" & strSheetname & "A1:D1]", OleDBCon)
Dim dtTable As New DataTable("Table1")
cmd.CommandType = CommandType.Text
Using daGetDataFromSheet As New OleDbDataAdapter(cmd)
daGetDataFromSheet.Fill(dtTable)
'Get Value from column 4 (3 because it is a zero-based index
strDate = dtTable.Rows(0).Item(3).ToString
End Using
End Using
'when the first correct sheet is found there is no need to check others
Exit For
End If
Next
OleDBCon.Close()
End Using
Catch ex As Exception
Throw New Exception(ex.Message, ex)
End Tr
Dts.Variables.Item("FileDate").Value = strDate
Dts.TaskResult = ScriptResults.Success
End Sub
-
在DataFlow任务中添加
派生列转换
,添加具有以下表达式的派生列
In the DataFlow Task add a
Derived Column Transformation
, add a derived column with the following expression
@[User::FileDate]
2。从第4行开始读取
2. Start Reading from Row Number 4
我们假设Excel文件路径存储在 @中[User :: FilePath]
- 首先打开
Excel Connection管理器
,并取消选中框第一行具有列名
- 在数据流任务中,双击excel source
- 将源设置为
SQL命令
- 使用以下命令:
SELECT * FROM [Refunds $ A4:D]
,因此它将从第4行开始读取 - 列名将如下所示F1 ... F4,在excel源代码中,您可以转到Columns选项卡并为列名指定别名,因此在数据流任务中,它们将以其别名显示
- First open the
Excel Connection Manager
and uncheck the boxFirst row has column names
- In the DataFlow Task, double click on the excel source
- Set the source to
SQL Command
- Use the following command:
SELECT * FROM [Refunds$A4:D]
, so it will start reading from the row number 4 - Columns names will be as the following F1 ... F4 , in the excel source you can go to the Columns Tab and give alias to the columns names, so in the data flow task they will be showed with their aliases
3。忽略Column1为NULL的所有行
3. Ignore all Rows Where Column1 is NULL
- 在Excel Source之后添加条件拆分
-
基于以下表达式分割流
- Add a conditional split after the Excel Source
Split the Flow based on the following expression
ISNULL([F1]) == False
如果您不给 F1
的别名,否则使用别名
If you didn't give an alias to F1
otherwise use the alias
最后,请记住,您必须添加一个派生列(如我们在第一个子问题中所述),其中包含日期值
Finally, remember that you must add a derived column (as we said in the first sub-problem) that contains the date value
这篇关于将多个数据从excel加载到sql SSIS的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!