如何在使用 SSIS 导入数据库之前验证 CSV 文件? [英] How to validate a CSV file before importing into the database using SSIS?
问题描述
我有包含三列的 CSV 文件.
I have CSV file with three columns.
sno sname quantity
--- ----- --------
1 aaa 23
2 bbb null
3 ccc 34
4 ddd ddd
5 eee xxx
6 fff 87
SQL Server 数据库中的表如下/
Table in the SQL Server database is as following/
CREATE TABLE csvtable
( sno int
, sname varchar(100)
, quantity numeric(5,2)
)
我创建了一个 SSIS 包来将 csv 文件数据导入到数据库表中.我在包执行过程中遇到错误,因为数量是一个字符串.我创建了另一个表来存储无效数据.
I created an SSIS package to import csv file data into the database table. I am getting an error during package execution because the quantity is a string. I created another table to store the invalid data.
CREATE TABLE wrongcsvtable
( sno nvarchar(10)
, sname nvarchar(100)
, quantity nvarchar(100)
)
在csvtable中,我想存储以下数据.
In the csvtable, I would like to store the following data.
sno sanme quantity
--- ------ --------
1 aaa 23
3 ccc 34
6 fff 87
在wrongcsvtable中,我想存储以下数据.
In the wrongcsvtable, I would like to store the following data.
sno sanme quantity
--- ------ --------
2 bbb null
4 ddd ddd
5 eee xxx
有人能指出我正确的方向来实现上述输出吗?
Could someone point me in the right direction to achieve the above mentioned output?
推荐答案
这是一种可能的选择.您可以使用 Data Flow Task
中的 Data Conversion
转换来实现这一点.以下示例显示了如何实现这一点.该示例使用 SSIS 2005 和 SQL Server 2008 数据库.
Here is one possible option. You can achieve this using the Data Conversion
transformation within the Data Flow Task
. Following example shows how this can be achieved. The example uses SSIS 2005 with SQL Server 2008 database.
分步过程:
创建一个名为
FlatFile.CSV
的文件并用数据填充它,如屏幕截图 #1 所示.
Create a file named
FlatFile.CSV
and populate it with data as shown in screenshot #1.
在 SQL 数据库中,使用 SQL 脚本 下提供的脚本创建名为 dbo.CSVCorrect
和 dbo.CSVWrong
的两个表部分.表 dbo.CSVWrong
中的字段应具有数据类型 VARCHAR 或 NVARCHAR 或 CHAR 以便它可以接受无效记录.
In the SQL database, create two tables named dbo.CSVCorrect
and dbo.CSVWrong
using the scripts provided under SQL Scripts section. The fields in the table dbo.CSVWrong
should have the data types VARCHAR or NVARCHAR or CHAR so that it can accept the invalid records.
在 SSIS 包上,创建一个名为 SQLServer 的 OLE DB 连接以连接到 SQL Server 数据库并创建一个名为 CSV 的平面文件连接.请参阅屏幕截图 #2.配置平面文件连接 CSV,如屏幕截图 #3 - #7 中所示.平面文件连接中的所有列都应配置为 string 数据类型,以便在读取文件时包不会失败.
On the SSIS package, create an OLE DB connection named SQLServer to connect to SQL Server database and create a Flat File Connection named CSV. Refer screenshot #2. Configure the flat file connection CSV as shown in screenshots #3 - #7. All the columns in the flat file connection should be configured as string data type so that the package doesn't fail while reading the file.
在包的 Control Flow 选项卡上,放置一个 Data Flow Task
,如屏幕截图 #8 所示.
On the Control Flow tab of the package, place a Data Flow Task
as shown in screenshot #8.
在包的数据流选项卡上,放置一个Flat File Source
并按照屏幕截图 #9 和 #10<中所示进行配置/strong>.
On the Data Flow tab of the package, place a Flat File Source
and configure it as shown in screenshots #9 and #10.
在包的数据流"选项卡上,放置一个数据转换
转换并按屏幕截图 #11 所示对其进行配置.单击Configure Error Output
并将Error 和Truncation 列值从Fail component 更改为Redirect行.请参阅屏幕截图 #12.
On the Data Flow tab of the package, place a Data Conversion
transformation and configure it as shown in screenshot #11. Click on the Configure Error Output
and change the Error and Truncation column values from Fail component to Redirect row. Refer screenshot #12.
在包的数据流选项卡上,放置一个OLE DB Destination
并将绿色箭头从数据转换连接到这个 OLE DB 目标.如屏幕截图 #13 和 #14 所示配置 OLE DB 目标.
On the Data Flow tab of the package, place an OLE DB Destination
and connect the green arrow from Data Conversion to this OLE DB Destination. Configure the OLE DB Destination as shown in screenshots #13 and #14.
在包的数据流选项卡上,放置另一个OLE DB Destination
并将红色箭头从数据转换连接到这个 OLE DB 目标.如屏幕截图 #15 和 #16 所示配置 OLE DB 目标.
On the Data Flow tab of the package, place another OLE DB Destination
and connect the red arrow from Data Conversion to this OLE DB Destination. Configure the OLE DB Destination as shown in screenshots #15 and #16.
屏幕截图 #17 显示完成配置后的数据流任务.
Screenshot #17 shows the Data Flow Task once it has been completely configured.
屏幕截图 #18 显示了包执行之前表中的数据.
Screenshot #18 shows data in the tables before the package execution.
屏幕截图 #19 显示了数据流任务中的包执行情况.
Screenshot #19 shows package execution within Data Flow Task.
屏幕截图 #20 显示包执行后表中的数据.
Screenshot #20 shows data in the tables after the package execution.
希望有所帮助.
SQL 脚本:
CREATE TABLE [dbo].[CSVCorrect](
[Id] [int] IDENTITY(1,1) NOT NULL,
[SNo] [int] NULL,
[SName] [varchar](50) NULL,
[QuantityNumeric] [numeric](18, 0) NULL,
CONSTRAINT [PK_CSVCorrect] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
CREATE TABLE [dbo].[CSVWrong](
[Id] [int] IDENTITY(1,1) NOT NULL,
[SNo] [varchar](50) NULL,
[Quantity] [varchar](50) NULL,
[SName] [varchar](50) NULL,
[ErrorCode] [int] NULL,
[ErrorColumn] [int] NULL,
CONSTRAINT [PK_CSVWrong] PRIMARY KEY CLUSTERED ([Id] ASC)) ON [PRIMARY]
GO
屏幕截图 #1:
屏幕截图 #2:
屏幕截图 #3:
屏幕截图 #4:
屏幕截图 #5:
屏幕截图 #6:
屏幕截图 #7:
截图 #8:
屏幕截图 #9:
屏幕截图 #10:
屏幕截图 #11:
屏幕截图 #12:
屏幕截图 #13:
屏幕截图 #14:
屏幕截图 #15:
屏幕截图 #16:
屏幕截图 #17:
屏幕截图 #18:
屏幕截图 #19:
屏幕截图 #20:
这篇关于如何在使用 SSIS 导入数据库之前验证 CSV 文件?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!