Importing and validating an XML file using SSIS or just plain T-SQL?


Question

What is the best practice for importing and validating an XML file into a single (flattened) table in SQL Server?

I have an XML file that contains about 15 complex types, all related to a single parent element. The SSIS design could look like this: But it gets very complicated with all those (15) joins.

Would it be a better idea to just write T-SQL code to:
1) Import the XML into a column that is of type XML and is linked to an XSD schema.
2) Use this code:

-- Stage the raw file in the XML column.
TRUNCATE TABLE XML_Import;

INSERT INTO XML_Import (ImportDateTime, XmlData)
SELECT GETDATE(), FileImport.XmlData
FROM
(
    SELECT  *
    FROM    OPENROWSET (BULK 'c:\XML-Data.xml', SINGLE_BLOB) AS RawFile
) AS FileImport (XmlData);

-- Flatten the staged XML into dbo.UserFlat.
DELETE FROM dbo.UserFlat;

-- USER is a reserved keyword in T-SQL, so the node aliases are renamed.
-- value() needs a singleton expression, hence the [1] on each path.
INSERT INTO dbo.UserFlat
SELECT
    u.value('(UserIdentifier)[1]', 'varchar(8)')   AS UserIdentifier,
    u.value('(Emailaddress)[1]', 'varchar(70)')    AS Emailaddress,
    ba.value('(Fax)[1]', 'varchar(70)')            AS Fax,
    emp.value('(EmploymentData)[1]', 'varchar(8)') AS EmploymentData
    -- , more values here ...
FROM
    XML_Import CROSS APPLY
    XmlData.nodes('//user') AS Users(u) CROSS APPLY
    u.nodes('BusinessAddress') AS BusinessAddress(ba) CROSS APPLY
    u.nodes('Employment') AS Employment(emp)
    -- More 'joins' here ...

to fill the 'UserFlat' table?
One disadvantage is that the SQL code has to be typed by hand, but the advantage is that I have more direct control over how the elements are processed and converted. However, I don't know whether there is any performance difference between processing XML in SSIS and processing it with T-SQL XML statements.
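Step 1 above (an XML column linked to an XSD schema) could be sketched roughly as follows. This is only an illustration: the collection name `dbo.UserSchemaCollection`, the column types, and the tiny inline XSD are assumptions standing in for the real schema, which would have to declare all 15 complex types (including `BusinessAddress` and `Employment`) for the flattening query to compile against the typed column.

```sql
-- Hypothetical schema collection; the XSD is a minimal stand-in for the real one.
CREATE XML SCHEMA COLLECTION dbo.UserSchemaCollection AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="users">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="user" maxOccurs="unbounded">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="UserIdentifier" type="xs:string" />
                <xs:element name="Emailaddress" type="xs:string" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>';
GO

-- Staging table from the question; the column types are assumed.
-- A typed XML column rejects any document that does not validate
-- against the schema collection at INSERT time.
CREATE TABLE dbo.XML_Import
(
    ImportDateTime datetime NOT NULL,
    XmlData        xml(DOCUMENT dbo.UserSchemaCollection) NOT NULL
);
```

With a typed column, validation happens as part of the insert itself, so an invalid file surfaces as an ordinary SQL error that the calling process can catch.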


Note that some other requirements are:

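  1. Error handling: if an error occurs, an e-mail must be sent to someone.
  2. Being able to handle multiple input files with a specific file name pattern: XML_{date}_{time}.xml
  3. Moving the processed XML files to a different folder.

Please advise.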

Recommended answer

Based on the requirements you have mentioned, I would say that you can use the best of both worlds (T-SQL and SSIS).

I feel that T-SQL gives more flexibility in loading the XML data that you have described in the question.

There are many different ways to achieve this. Here is one possible option:

  1. Create a stored procedure that takes the path of the XML file as an input parameter.

  2. Perform the XML data load using the T-SQL approach, which you find easier.

  3. Use an SSIS package to perform error handling, file processing, archiving, and sending e-mail.

  4. Use the logging feature available in SSIS. It requires only simple configuration. The sample in "How to track status of rows successfully processed or failed in SSIS data flow task?" shows how to configure logging in SSIS.
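A minimal sketch of the stored procedure from step 1, assuming the `XML_Import` staging table from the question lives in the `dbo` schema. The procedure name `dbo.usp_ImportUserXml` and the `nvarchar(260)` parameter length are hypothetical. One detail worth knowing: `OPENROWSET(BULK ...)` accepts only a string literal for the file path, not a variable, so the statement has to be built dynamically.

```sql
-- Hypothetical procedure name; XML_Import and its columns come from the question.
CREATE PROCEDURE dbo.usp_ImportUserXml
    @FilePath nvarchar(260)
AS
BEGIN
    SET NOCOUNT ON;

    -- Build the load statement dynamically because OPENROWSET(BULK ...)
    -- does not accept a variable as the path. Single quotes in the path
    -- are doubled so the value stays inside the string literal.
    DECLARE @sql nvarchar(max) =
        N'INSERT INTO dbo.XML_Import (ImportDateTime, XmlData)
          SELECT GETDATE(), FileImport.BulkColumn
          FROM OPENROWSET(BULK ''' + REPLACE(@FilePath, N'''', N'''''')
        + N''', SINGLE_BLOB) AS FileImport;';

    TRUNCATE TABLE dbo.XML_Import;
    EXEC sys.sp_executesql @sql;

    -- The flattening INSERT ... SELECT into dbo.UserFlat from the
    -- question would follow here.
END;
```

It could then be called as `EXEC dbo.usp_ImportUserXml @FilePath = N'c:\XML-Data.xml';`. Any error raised inside the procedure (for example, a file that is not well-formed XML) propagates to the caller, which is what lets the SSIS package react to the failure.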

A sample mock-up of your flow is shown in the screenshot. Loop through the files using a Foreach Loop container. Pass the file path as a parameter to an Execute SQL Task, which in turn calls the T-SQL that you mentioned. After processing the file, use a File System Task to move it to an archive folder.
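As a rough illustration of the Execute SQL Task step, assuming the load logic has been wrapped in a stored procedure (here hypothetically named `dbo.usp_ImportUserXml`) and the Foreach Loop stores the current file path in a package variable (here hypothetically `User::FilePath`), the task's SQLStatement over an OLE DB connection could look like this:

```sql
-- SQLStatement of the Execute SQL Task (OLE DB connection manager).
-- The ? marker is bound on the Parameter Mapping page:
--   Variable Name : User::FilePath   (hypothetical variable set by the Foreach Loop)
--   Direction     : Input
--   Data Type     : NVARCHAR
--   Parameter Name: 0                (OLE DB parameters are numbered from 0)
EXEC dbo.usp_ImportUserXml ?;
```

Other connection types use different parameter markers and naming rules, so the mapping above applies to OLE DB only.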

The sample in "SSIS reading multiple xml files from folder" shows how to loop through files using a Foreach Loop container. It loops through XML files but uses a Data Flow Task, because those XML files are in a simpler format.

The sample in "How to send the records from a table in an e-mail body using SSIS package?" shows how to send e-mail using the Send Mail Task.

The sample in "How do I move files to an archive folder after the files have been processed?" shows how to move files to an archive folder.

The sample in "Branching after a file system task in SSIS without failing the package" shows how to continue package execution even after a particular task fails. This lets the package proceed even if the Foreach Loop fails, so that you can still send the e-mail. The blue arrow in the screenshot is an "on completion" precedence constraint: the next task runs once the previous task completes, whether it succeeded or failed.

The sample in "How do I pick the most recently created folder using Foreach loop container in SSIS package?" shows how to perform pattern matching.

Hope that gives you an idea.
