XML R How to retrieve values (could this be a namespace issue?)


Problem description

Just when I thought I understood XPath! I must be missing something really simple, but I can't select the value of the node "citedby-count" in the following:

library(XML)  # provides xmlParse() and the XPath subsetting used below

xml <- "<?xml version='1.0' encoding='UTF-8'?>
        <search-results xmlns='http://www.w3.org/2005/Atom' xmlns:cto='http://www.elsevier.com/xml/cto/dtd' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:prism='http://prismstandard.org/namespaces/basic/2.0/' xmlns:opensearch='http://a9.com/-/spec/opensearch/1.1/' xmlns:dc='http://purl.org/dc/elements/1.1/'>

            <entry>
                 <prism:url>http://api.elsevier.com/content/abstract/scopus_id/111111</prism:url>
                 <dc:title>Paper Title</dc:title>
                 <citedby-count>1</citedby-count>
            </entry> 
        </search-results>"

doc <- xmlParse(xml, asText = TRUE)  # asText = TRUE: xml holds XML content, not a file name

I tried

doc["//citedby-count"]

doc["//{'citedby-count'}"]

doc["//entry"]

but they all return

list()
attr(,"class")
[1] "XMLNodeSet"

However,

doc["//dc:title"] 

works fine.

Have I just been looking at this too long? Please help!

I thought this was because of the hyphen, but it can't be, because

doc["//entry"] 

doesn't work either.

Recommended answer

A namespace prefix is normally declared as xmlns:foo="...", where foo is the prefix, and it is used explicitly in an element name as <foo:bar>, where bar is the element's local name. Besides prefixed namespaces there is the default namespace. It is declared without a prefix, as xmlns="...", and it applies implicitly to the element on which it is declared and to all of that element's descendants, unless something overrides the inheritance: a descendant declares its own default namespace, or a descendant's name uses an explicit prefix.

That's the first part of the story, which is about namespaces in XML. XPath 1.0, on the other hand, has no notion of a default namespace: an element name without a prefix always refers to an element in no namespace. In your document, <entry> and <citedby-count> inherit the default namespace http://www.w3.org/2005/Atom, so //entry and //citedby-count match nothing, while //dc:title works because it carries a prefix. To bridge this difference between XML and XPath, when you need to query an element in the default namespace you usually have to define a prefix that points to the XML's default namespace URI and use that prefix in the XPath expression. That's basically what @hrbrmstr suggested in the first comment, something like the following (the prefix can be anything, as long as it is mapped to the correct namespace URI):

doc["//d:citedby-count", namespaces=c(d="http://www.w3.org/2005/Atom")]

But it turns out that your XML already declares an explicit prefix, atom, which points to the same namespace URI, so it can be used directly, for example as //atom:citedby-count.
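To make the two options concrete, here is a minimal sketch that pulls out the text of citedby-count both ways. It assumes the XML package is installed, uses a trimmed copy of the document from the question, and reaches for xpathSApply() with xmlValue to return the value rather than the node; the variable names (via_atom, via_d, no_prefix) are only illustrative.

```r
library(XML)

# Trimmed copy of the document from the question: the default namespace and
# the atom prefix both point at the same URI.
xml <- "<?xml version='1.0' encoding='UTF-8'?>
<search-results xmlns='http://www.w3.org/2005/Atom' xmlns:atom='http://www.w3.org/2005/Atom' xmlns:dc='http://purl.org/dc/elements/1.1/'>
  <entry>
    <dc:title>Paper Title</dc:title>
    <citedby-count>1</citedby-count>
  </entry>
</search-results>"

doc <- xmlParse(xml, asText = TRUE)

# Option 1: reuse the atom prefix that the document already declares.
via_atom <- xpathSApply(doc, "//atom:citedby-count", xmlValue)

# Option 2: map a prefix of our own (d is arbitrary) to the same URI.
ns <- c(d = "http://www.w3.org/2005/Atom")
via_d <- xpathSApply(doc, "//d:entry/d:citedby-count", xmlValue, namespaces = ns)

# An unprefixed name is looked up in no namespace, so nothing matches.
no_prefix <- getNodeSet(doc, "//citedby-count")
```

Both via_atom and via_d should hold the string "1", while no_prefix is an empty node set, which is the behaviour described in the question. Wrap the result in as.integer() if you need the count as a number.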
