Within a function, return NA or 0 if an xpath is not found
Problem description
Within a function, I need to return "NA", or better "0", for an xpath item that is NOT (!) on the page. On most of the pages I scrape from the list the xpath item exists, but on some it does not. If it doesn't exist, the returned vectors end up with different lengths and cannot be combined any further.
library(rvest)   # read_html(), html_nodes(), html_text(), %>%
library(readr)   # parse_number()
library(tibble)  # tibble()

return_data <- function(url) {
  page <- url %>% read_html()
  tibble(
    YealyRevenue = page %>%
      html_nodes(xpath = '//div[contains(h4, "YealyRevenue")]') %>%
      html_text(trim = TRUE) %>%
      parse_number(),
    Cashflow = page %>%
      html_nodes(xpath = '//div[contains(h4, "Cashflow:")]') %>%
      html_text(trim = TRUE) %>%
      parse_number(),
    Spendings = page %>%
      html_nodes(xpath = '//*[@id="Spendings"]/a') %>%
      html_text(trim = TRUE) %>%
      parse_number(),
    Return = page %>%
      html_nodes(xpath = '//*[@id="Return"]/div[1]/div[2]/div/div[2]/div[2]/h1') %>%
      html_text(trim = TRUE)
  )
}
The last item is the one that does not exist on every page I scrape:
Return = page %>%
  html_nodes(xpath = '//*[@id="Return"]/div[1]/div[2]/div/div[2]/div[2]/h1') %>%
  html_text(trim = TRUE)
So for this, I would need something like:
"If this xpath is not found, return "0"."
Thanks for any pointers!
Recommended answer
We could wrap the chain in tryCatch and specify the value to return when there is an error. It is also possible to add further return values for the case where there is a warning.
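To illustrate how separate `error` and `warning` handlers can return different fallback values, here is a minimal sketch in base R. The helper `to_number_or` and its arguments are hypothetical names made up for this example; they are not part of the original answer or of any package.

```r
# Hypothetical helper, for illustration only: convert text to a number and
# fall back to fixed values when the conversion warns or fails.
to_number_or <- function(x, on_warning = NA_real_, on_error = 0) {
  tryCatch(
    as.numeric(x),                     # warns if x is not numeric text
    warning = function(w) on_warning,  # value returned when a warning is raised
    error   = function(e) on_error     # value returned when an error is raised
  )
}

to_number_or("42")          # plain conversion, no handler is triggered
to_number_or("n/a")         # as.numeric() warns, so on_warning is returned
to_number_or(list(1, 2:3))  # as.numeric() errors, so on_error is returned
```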
# The function from the question, with the optional `Return` chain
# wrapped in tryCatch()
library(rvest)
library(readr)
library(tibble)

return_data <- function(url) {
  page <- url %>% read_html()
  YealyRevenue <- page %>%
    html_nodes(xpath = '//div[contains(h4, "YealyRevenue")]') %>%
    html_text(trim = TRUE) %>%
    parse_number()
  Cashflow <- page %>%
    html_nodes(xpath = '//div[contains(h4, "Cashflow:")]') %>%
    html_text(trim = TRUE) %>%
    parse_number()
  Spendings <- page %>%
    html_nodes(xpath = '//*[@id="Spendings"]/a') %>%
    html_text(trim = TRUE) %>%
    parse_number()
  Return <- tryCatch({
    page %>%
      html_nodes(xpath = '//*[@id="Return"]/div[1]/div[2]/div/div[2]/div[2]/h1') %>%
      html_text(trim = TRUE)
  },
  error = function(err) {
    message("xpath doesn't exist")
    NA
  })
  # Column names must match the variables assigned above
  tibble(YealyRevenue, Cashflow, Spendings, Return)
}
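One caveat worth checking on your own pages: in rvest, an xpath that matches nothing in `html_nodes()` normally yields an empty node set rather than an error, so `html_text()` returns `character(0)` and the `error` handler may never run. A minimal sketch of an explicit length check follows; the helper `first_or_default` is a hypothetical name introduced for this example, not something from the original answer or from rvest.

```r
# Hypothetical helper, for illustration only: return `default` when a
# scraping chain produced no values, otherwise keep the first value.
first_or_default <- function(x, default = NA_character_) {
  if (length(x) == 0) default else x[[1]]
}

first_or_default(character(0))                 # nothing matched: NA fallback
first_or_default(character(0), default = "0")  # nothing matched: "0" fallback
first_or_default(c("12 %", "7 %"))             # matched: first value is kept
```

In `return_data()`, the `Return` chain could then end with `%>% first_or_default("0")`, so the column always has exactly one value. Switching that one chain to `html_node()` (singular) is another option to consider, since it returns a missing node for which `html_text()` gives `NA`.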