使用 nodes() 方法在 SQL 中展平分层 XML [英] Flattening hierarchical XML in SQL using the nodes() method
问题描述
我有一个将 XML 文档作为参数的存储过程,其结构类似于以下内容:
I have a Stored Procedure that takes an XML document as a parameter similar in a structure to the following:
<grandparent name="grandpa bob">
<parent name="papa john">
<children>
<child name="mark" />
<child name="cindy" />
</children>
</parent>
<parent name="papa henry">
<children>
<child name="mary" />
</children>
</parent>
</grandparent>
我的要求是将这些数据展平",以便将其插入临时表并在过程中进一步操作,因此上述 XML 变为:
My requirement is to "flatten" this data so that it can be inserted into a temporary table and manipulated further down the procedure, so the above XML becomes:
Grandparent Name Parent Name Child Name
---------------- --------------- ---------------
grandpa bob papa john mark
grandpa bob papa john cindy
grandpa bob papa henry mary
这是目前使用 SQL Server XML 节点完成的:
This is currently being done using SQL Server XML Nodes:
SELECT
VIRT.node.value('../../../@name','varchar(15)') 'Grandparent Name',
VIRT.node.value('../../@name','varchar(15)') 'Parent Name',
VIRT.node.value('@name','varchar(15)') 'Child Name'
FROM
@xmlFamilyTree.nodes('/grandparent/parent/children/child') AS VIRT(node)
这很有效,直到我开始在过程中抛出大量数据(即 1000 多个 child
节点),此时它会停止并需要 1 到 2 分钟的时间来执行.我认为这可能是因为我从最低级别 (<child
) 开始,然后每次出现都遍历备份 XML 文档.将这个单个查询拆分为 3 个块(我需要从中获取数据的每个节点一个)会提高性能吗?鉴于这些节点都没有我可以用来连接备份的密钥",有人可以提供任何指示我可以如何去做吗?
This works great until I start throwing large amounts of data at the procedure (i.e. 1000+ child
nodes), at which point this grinds to a halt and takes between 1 and 2 minutes to execute. I think this may be due to the fact that I am starting off at the lowest level (<child
) and then traversing back up the XML document for each occurance. Would splitting this single query into 3 chunks (one per node that I need data from) improve performance here? Given that none of these nodes have "keys" on them that I could use to join back up with, could anyone offer any pointers how I might be able to go about doing this?
推荐答案
在网上多看了几遍后,我似乎已经回答了我自己的问题:
I seem to have answered my own question after a bit more looking around online:
SELECT
grandparent.gname.value('@name', 'VARCHAR(15)'),
parent.pname.value('@name', 'VARCHAR(15)'),
child.cname.value('@name', 'VARCHAR(15)')
FROM
@xmlFamilyTree.nodes('/grandparent') AS grandparent(gname)
CROSS APPLY
grandparent.gname.nodes('*') AS parent(pname)
CROSS APPLY
parent.pname.nodes('children/*') AS child(cname)
使用CROSS APPLY
我可以选择顶级grandparent
节点并使用它来选择子parent
节点等等.使用这种方法,我已将查询从 1 分 30 秒 左右缩短到 6 秒 左右.
Using CROSS APPLY
I can select the top-level grandparent
node and use this to select the child parent
nodes and so on. Using this method I have taken my query from executing in around 1 minute 30 seconds down to around 6 seconds.
有趣的是,如果我使用旧的"OPEN XML
方法来检索相同的数据,查询会在 1 秒内执行!
Interestingly though, if I use the "old" OPEN XML
method to retrieve the same data, the query executes in 1 second!
似乎您可能必须根据传入的文档的预期大小/复杂性逐案处理这两种技术的使用.
It seems like you may have to approach the use of these two techniques on a case-by-case basis depending on the expected size/complexity of the document being passed in.
这篇关于使用 nodes() 方法在 SQL 中展平分层 XML的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!