How to return NA when nothing is found in an xpath?

Views: 161 Published: 2020/11/24 2:56:18 Tags: html r xpath web-scraping html-parsing


Question

The question is hard to state in the abstract, but an example makes it easy to understand.

I use R to parse HTML.

Below, I have an HTML string called html, and I try to extract all the values in //span[@class="number"] and all the values in //span[@class="surface"]:

html <- '<div class="line">
<span class="number">Number 1</span>
<span class="surface">Surface 1</span>
</div>
<div class="line">
<span class="surface">Surface 2</span>
</div>' 

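library(XML)  # provides htmlTreeParse, xpathApply and xmlValue

# Parse the string into an internal document that XPath queries can run against
page = htmlTreeParse(html, useInternalNodes = TRUE, encoding = "UTF-8")

# Each query returns only the nodes that exist, so the two results can differ in length
number = unlist(xpathApply(page, '//span[@class="number"]', xmlValue))
surface = unlist(xpathApply(page, '//span[@class="surface"]', xmlValue))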

The output of number is:

[1] "Number 1"

The output of surface is:

[1] "Surface 1" "Surface 2"

Then, when I try to cbind the two vectors, I can't combine them properly, because they don't have the same length.
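As a small illustration of the mismatch, here is a sketch that uses the two result vectors shown above, typed in by hand. Note one detail of base R: when the longer length is a multiple of the shorter one, as it is here, cbind recycles the shorter vector instead of stopping, so the second row silently gets a wrong value rather than NA.

```r
# The two vectors as returned by the XPath queries above (typed in by hand)
number  <- c("Number 1")
surface <- c("Surface 1", "Surface 2")

# cbind recycles the length-1 vector, so "Number 1" is repeated in row 2,
# even though the second div has no number at all
combined <- cbind(number, surface)
print(combined)
```

This is why the missing value has to be recorded as NA at extraction time: once the vectors are built, the position of the gap is lost.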

So my question is: what can I do so that the output for number is:

[1] "Number 1" NA

Then I could combine number and surface.

Recommended answer

It's easier to select the enclosing tag (the div here) for each row, and then look for each inner tag inside it. With rvest and purrr, which I find simpler:

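library(rvest)
library(purrr)

# html is the string defined in the question
html %>% read_html() %>% 
    html_nodes('.line') %>%    # one node per enclosing div
    # html_node() (singular) yields a missing node when nothing matches,
    # and html_text() turns that into NA
    map_df(~list(number = .x %>% html_node('.number') %>% html_text(), 
                 surface = .x %>% html_node('.surface') %>% html_text()))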

#> # A tibble: 2 × 2
#>     number   surface
#>      <chr>     <chr>
#> 1 Number 1 Surface 1
#> 2     <NA> Surface 2
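The same idea can also be sketched with the XML package and XPath used in the question. This is only one possible approach, not part of the original answer: first_or_na is a hypothetical helper introduced here for illustration, and the sketch assumes each div holds at most one span of each class (only the first match is kept).

```r
library(XML)

html <- '<div class="line">
<span class="number">Number 1</span>
<span class="surface">Surface 1</span>
</div>
<div class="line">
<span class="surface">Surface 2</span>
</div>'

page <- htmlTreeParse(html, useInternalNodes = TRUE, encoding = "UTF-8")

# Hypothetical helper: evaluate a relative XPath inside one node and
# return the first match, or NA when the node has no match
first_or_na <- function(node, path) {
  hits <- xpathSApply(node, path, xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# One entry per enclosing div, so both vectors end up the same length
lines <- getNodeSet(page, '//div[@class="line"]')
number  <- vapply(lines, first_or_na, character(1), path = './span[@class="number"]')
surface <- vapply(lines, first_or_na, character(1), path = './span[@class="surface"]')

result <- cbind(number, surface)
print(result)
```

The leading ./ in each path keeps the search inside the current div. A path starting with // would search the whole document again and bring back the original length mismatch.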
