How can I ignore phantom XML attributes which are creating an infinite loop when trying to rename nodes?


Question

I've been tasked with converting the results of a RESTful web service into an XML document with new formatting.

An example of the HTML/XHTML to be converted:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
    <head>
        <title>OvidWS Result Set Resource</title>
    </head>
    <body>
        <table id="results">
            <tr>
                <td class="_index">
                  <a class="uri" href="REDACTED">1</a>
                </td>
                <td class="au">
                  <span>GILLESPIE JB</span>
                  <span>KUKES RE</span>
                </td>
                <td class="so">A.M.A. American Journal of Diseases of Children</td>
                <td class="ti">Acetylsalicylic acid poisoning with recovery.</td>
                <td class="ui">20267726</td>
                <td class="yr">1947</td>
              </tr>
              <tr>
                <td class="_index">
                  <a class="uri" href="REDACTED">2</a>
                </td>
                <td class="au">BASS MH</td>
                <td class="so">Journal of the Mount Sinai Hospital, New York</td>
                <td class="ti">Aspirin poisoning in infants.</td>
                <td class="ui">20265054</td>
                <td class="yr">1947</td>
              </tr>
        </table>  
    </body>
</html>

Ideally, all I want to do is take whatever is listed as the 'class' attribute and make it the element name. Where there is no 'class' attribute, I just want to mark the element as an item.

This is the transformation I am looking for:

<results>
    <citation>
        <_index>
            <uri href="REDACTED">1</uri>
        </_index>
        <au>
            <item>GILLESPIE JB</item>
            <item>KUKES RE</item>
        </au>
        <so>A.M.A. American Journal of Diseases of Children</so>
        <ti>Acetylsalicylic acid poisoning with recovery.</ti>
        <ui>20267726</ui>
        <yr>1947</yr>
    </citation>
    <citation>
        <_index>
            <uri href="REDACTED">2</uri>
        </_index>
        <au>BASS MH</au>
        <so>Journal of the Mount Sinai Hospital, New York</so>
        <ti>Aspirin poisoning in infants.</ti>
        <ui>20265054</ui>
        <yr>1947</yr>
    </citation>
</results>  

I found a little piece of code here (http://social.msdn.microsoft.com/Forums/en-US/xmlandnetfx/thread/bd61735f-3fef-4e37-b133-744c7e0123fe) which allows me to rename a node:

    Public Shared Function RenameNode(ByVal e As XmlNode, newName As String) As XmlNode
        Dim doc As XmlDocument = e.OwnerDocument
        Dim newNode As XmlNode = doc.CreateNode(e.NodeType, newName, Nothing)
        While (e.HasChildNodes)
            newNode.AppendChild(e.FirstChild)
        End While
        Dim ac As XmlAttributeCollection = e.Attributes
        While (ac.Count > 0) 
            newNode.Attributes.Append(ac(0))
        End While
        Dim parent As XmlNode = e.ParentNode
        parent.ReplaceChild(newNode, e)
        Return newNode
    End Function

But a problem arises when iterating over the XmlAttributeCollection. For some reason, when looking at one of the td nodes, two attributes that don't appear in the source magically appear: rowspan and colspan. These attributes seem to interfere with the loop: when they are consumed, they do not disappear from the attribute list the way the 'class' attribute does. Instead, the value of the attribute is consumed (changing from "1" to ""). This results in an infinite loop.

I note that they are of type 'XmlUnspecifiedAttribute', but when I modify the loop to detect that:

While (ac.Count > 0) And Not TypeOf (ac(0)) Is System.Xml.XmlUnspecifiedAttribute
    newNode.Attributes.Append(ac(0))
End While

I get the following error:

System.Xml.XmlUnspecifiedAttribute is not accessible in this context because it is 'friend'

Any ideas why this is happening or how to work around it?

Recommended answer

I think the problem you are having is indeed caused by your DOCTYPE declaration.
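As background for this diagnosis: the XHTML 1.0 Strict DTD declares a default value of "1" for rowspan and colspan on td, and a DOM that loads the DTD exposes those defaults as attributes even though the source never wrote them. The loop described in the question is consistent with the default being put back each time it is taken away. If the DOCTYPE has to stay, a possible workaround is to move only the attributes whose public XmlAttribute.Specified property is True. The following is a minimal sketch under that assumption, not the code from the answer; the module and function names are hypothetical, and it assumes the node is an XmlElement that has a parent.

```vbnet
Imports System.Collections.Generic
Imports System.Xml

Public Module RenameSketch
    ' Hypothetical variant of RenameNode: moves only the attributes that were
    ' explicitly written in the source and leaves DTD-supplied defaults behind.
    ' Assumes e has a parent node.
    Public Function RenameElementKeepingSpecified(ByVal e As XmlElement, ByVal newName As String) As XmlElement
        Dim doc As XmlDocument = e.OwnerDocument
        Dim newNode As XmlElement = doc.CreateElement(newName)

        ' Move the children; each AppendChild detaches the child from e.
        While e.HasChildNodes
            newNode.AppendChild(e.FirstChild)
        End While

        ' Take a snapshot first so the collection is not modified while it
        ' is being enumerated.
        Dim explicitAttributes As New List(Of XmlAttribute)()
        For Each a As XmlAttribute In e.Attributes
            If a.Specified Then explicitAttributes.Add(a)
        Next

        ' Appending an attribute that already has an owner detaches it first.
        For Each a As XmlAttribute In explicitAttributes
            newNode.Attributes.Append(a)
        Next

        e.ParentNode.ReplaceChild(newNode, e)
        Return newNode
    End Function
End Module
```

Because the loop runs over a fixed list instead of testing ac.Count > 0, it ends even if a default attribute reappears on the old element.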

Since you are translating the nodes into something else completely, I would say you don't even need the DOCTYPE and can safely ignore it (http://stackoverflow.com/questions/12428478/xmldocument-loadxml-hanging-for-several-minutes).

My first tests did not include the DOCTYPE. When I added it, the XmlResolver went haywire, so I am assuming you certainly don't need it here.

You can ignore it by setting the resolver to Nothing:

{xml document object}.XmlResolver = Nothing

Then you select the node and process it. I did this even with the DOCTYPE in the source file and still had no issues.
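An alternative way to keep the DTD out of the picture, shown here only as a hedged sketch, is to load the document through an XmlReader whose settings ignore the DOCTYPE. This assumes .NET Framework 4.0 or later, where XmlReaderSettings.DtdProcessing is available; the module and function names are hypothetical.

```vbnet
Imports System.IO
Imports System.Xml

Public Module LoadSketch
    ' Hypothetical helper: loads an XML file while skipping its DOCTYPE,
    ' so no DTD is fetched and no default attributes are added.
    Public Function LoadWithoutDtd(ByVal path As String) As XmlDocument
        Dim settings As New XmlReaderSettings()
        settings.DtdProcessing = DtdProcessing.Ignore
        settings.XmlResolver = Nothing

        Dim doc As New XmlDocument()
        doc.XmlResolver = Nothing

        ' Open the file directly so that no resolver is needed to read it.
        Using stream As FileStream = File.OpenRead(path)
            Using reader As XmlReader = XmlReader.Create(stream, settings)
                doc.Load(reader)
            End Using
        End Using
        Return doc
    End Function
End Module
```

One caveat to check before relying on this: with the DTD ignored, entities that only the DTD declares (for example &nbsp;) are no longer defined, so a document that uses them would presumably fail to load.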

Here is the code I used to test:

Private Sub Form1_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    Dim USEDoc As New XmlDocument

    Dim theNameManager As System.Xml.XmlNamespaceManager = New System.Xml.XmlNamespaceManager(USEDoc.NameTable)
    theNameManager.AddNamespace("xhtml", "http://www.w3.org/1999/xhtml")

    USEDoc.XmlResolver = Nothing
    USEDoc.Load("RestServ.txt")

    renameNodes(USEDoc.SelectSingleNode("descendant::xhtml:table", theNameManager))

    Dim SaveDoc As New XmlDocument
    SaveDoc.AppendChild(SaveDoc.ImportNode(USEDoc.SelectSingleNode("//results", theNameManager), True))

    SaveDoc.Save("RestServConv.xml")
End Sub

Public Function renameNodes(ByVal TopNode As XmlNode) As Boolean
    Dim UseNode As XmlNode = Nothing

    If TopNode.Name <> "#text" Then
        If TopNode.Name = "tr" Then
            UseNode = RenameNode(TopNode, "citation")
        ElseIf TopNode.Name = "table" Then
            UseNode = RenameNode(TopNode, "results")
            UseNode.Attributes.RemoveNamedItem("id")
        ElseIf TopNode.Attributes.Count > 0 Then
            For Each oAttribute As XmlAttribute In TopNode.Attributes
                If oAttribute.Name = "class" Then
                    UseNode = RenameNode(TopNode, oAttribute.Value)
                    UseNode.Attributes.RemoveNamedItem("class")
                    Exit For
                End If
            Next oAttribute
        End If

        If UseNode IsNot Nothing Then
            If UseNode.ChildNodes.Count > 0 Then
                Dim x As Integer
                For x = 0 To UseNode.ChildNodes.Count - 1
                    renameNodes(UseNode.ChildNodes(x))
                Next x
            End If
        End If
    End If

    Return True
End Function

Public Shared Function RenameNode(ByVal e As XmlNode, ByVal newName As String) As XmlNode
    Dim doc As XmlDocument = e.OwnerDocument
    Dim newNode As XmlNode = doc.CreateNode(e.NodeType, newName, Nothing)
    While (e.HasChildNodes)
        newNode.AppendChild(e.FirstChild)
    End While
    Dim ac As XmlAttributeCollection = e.Attributes
    While (ac.Count > 0)
        newNode.Attributes.Append(ac(0))
    End While
    Dim parent As XmlNode = e.ParentNode
    parent.ReplaceChild(newNode, e)
    Return newNode
End Function

I passed in your example document and the result I got was this:

<results>
  <citation>
    <_index>
      <uri href="REDACTED">1</uri>
    </_index>
    <au>
      <span xmlns="http://www.w3.org/1999/xhtml">GILLESPIE JB</span>
      <span xmlns="http://www.w3.org/1999/xhtml">KUKES RE</span>
    </au>
    <so rowspan="1" colspan="1">A.M.A. American Journal of Diseases of Children</so>
    <ti>Acetylsalicylic acid poisoning with recovery.</ti>
    <ui>20267726</ui>
    <yr>1947</yr>
  </citation>
  <citation>
    <_index>
      <uri href="REDACTED">2</uri>
    </_index>
    <au>BASS MH</au>
    <so>Journal of the Mount Sinai Hospital, New York</so>
    <ti>Aspirin poisoning in infants.</ti>
    <ui>20265054</ui>
    <yr>1947</yr>
  </citation>
</results>
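The result above still contains the span elements in the XHTML namespace, whereas the question asked for elements without a 'class' attribute to become item. One way to close that gap, sketched here as an assumption and not as part of the tested code, is a helper that chooses the target name for every element. The module and function names are hypothetical, and XmlConvert.EncodeLocalName is used only as a guard against class values that are not valid XML names.

```vbnet
Imports System.Xml

Public Module NamingSketch
    ' Hypothetical helper: picks the new element name following the rules
    ' stated in the question.
    Public Function TargetName(ByVal el As XmlElement) As String
        ' Fixed mappings used by the test code above.
        Select Case el.LocalName
            Case "table"
                Return "results"
            Case "tr"
                Return "citation"
        End Select

        ' GetAttribute returns an empty string when the attribute is absent.
        Dim cls As String = el.GetAttribute("class")
        If cls.Length > 0 Then
            Return XmlConvert.EncodeLocalName(cls)
        End If

        ' No class attribute: mark the element as an item.
        Return "item"
    End Function
End Module
```

The returned name could then be passed to RenameNode for each element under the table, removing the 'class' (and 'id') attributes afterwards as the test code already does.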
