XPath in R: return NA if node is missing


Question

I'm trying to search for nodes in an HTML document using XPath in R. In the code below, I would like to know how to return NULL or NA when a node is missing:

library(XML)
b <- '
<bookstore specialty="novel">
<book style="autobiography">
<author>
<first-name>Joe</first-name>
<last-name>Bob</last-name>
</author>
</book>
<book style="textbook">
<author>
<first-name>Mary</first-name>
<last-name>Bob</last-name>
</author>
<author>
<first-name>Britney</first-name>
<last-name>Bob</last-name>
</author>
<price>55</price>
</book>
<book style="novel" id="myfave">
<author>
<first-name>Toni</first-name>
<last-name>Bob</last-name>
</author>
</book>
</bookstore>
'
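# Parse as HTML and keep internal nodes so XPath queries can be run on the result
doc2 <- htmlTreeParse(b, useInternalNodes = TRUE)
# Extract the text of every <first-name> node under an <author>
xpathApply(doc2, "//author/first-name", xmlValue)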

For instance, when I run xpathApply() with the path above, I get 4 results. If I delete one of the <first-name> nodes, I want xpathApply() to return NULL or some other placeholder in its position; I don't want it to skip that author. If I delete <first-name>Mary</first-name>, I want the result to look like this:

Joe
NA
Britney
Toni

Recommended answer

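You can apply a function to each <author> node and return NA when it has no <first-name> child (the output below is with <first-name>Mary</first-name> removed):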

xpathApply(doc2, "//author",
           function(x){
             if("first-name" %in% names(x))
               xmlValue(x[["first-name"]])
             else NA})

[[1]]
[1] "Joe"

[[2]]
[1] NA

[[3]]
[1] "Britney"

[[4]]
[1] "Toni"
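
If a plain character vector is more convenient than a list, the same idea can be sketched as follows. This is only an illustration, not part of the original answer: `child_or_na` is a hypothetical helper name, the data is a trimmed copy of the question's XML with Mary's <first-name> already removed, and it is parsed with xmlParse() rather than htmlTreeParse().

```r
library(XML)

# Trimmed copy of the question's data; Mary's <first-name> is deliberately absent
b <- '
<bookstore>
  <book>
    <author><first-name>Joe</first-name><last-name>Bob</last-name></author>
  </book>
  <book>
    <author><last-name>Bob</last-name></author>
    <author><first-name>Britney</first-name><last-name>Bob</last-name></author>
  </book>
  <book>
    <author><first-name>Toni</first-name><last-name>Bob</last-name></author>
  </book>
</bookstore>
'
doc <- xmlParse(b, asText = TRUE)

# Hypothetical helper: text of the first child named `child`, or NA if there is none
child_or_na <- function(node, child) {
  # Relative XPath, evaluated with `node` as the context node
  hits <- xpathSApply(node, paste0("./", child), xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# One value per <author>, so positions line up with the authors in the document
authors <- getNodeSet(doc, "//author")
first_names <- vapply(authors, child_or_na, character(1), child = "first-name")
print(first_names)
```

Because the function runs once per <author> rather than once per matched <first-name>, a missing child shows up as NA in the second position instead of being skipped.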
