Skip errors in R for loops and also pause the process in each iteration
Question
I have two questions regarding loops in R.
1) I'm using the XML package to scrape some tables from a website and combine them using rbind. I'm using the following code, and it works without issues if price data and tables are present on the given websites.
url.list <- c("www1", "www2", "www3")
for(url_var in url.list)
{
url <- url_var
url.parsed <- htmlParse(getURL(url), asText = TRUE)
tableNodes <- getNodeSet(url.parsed, '//*[@id="table"]/table')
newdata <- readHTMLTable(tableNodes[[1]], header=F, stringsAsFactors=F)
big.data <- rbind(newdata, big.data)
Sys.sleep(30)
}
But sometimes a web page does not have the corresponding table (in that case I'm left with a one-variable table containing the message "No current prices reported."), and my loop stops with the following error message (since the numbers of table columns do not match):
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
I want R to ignore the error and go ahead with the next web page (skipping the one that has a different number of columns).
2) At the end of the loop I have Sys.sleep(30). Does it force R to wait 30 seconds before it tries the next web page?
Thanks.
Recommended answer
As @RuiBarradas mentioned in the comments, tryCatch is the way to handle errors (or even warnings) in R. In your case, you need to move on to the next iteration when an error occurs, so you can do something like:
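library(XML)    # htmlParse, getNodeSet, readHTMLTable
library(RCurl)  # getURL

# big.data is assumed to exist already (for example, big.data <- NULL)
for (url_var in url.list) {
  url.parsed <- htmlParse(getURL(url_var), asText = TRUE)
  # Try to run the code within these braces;
  # tryCatch returns the combined table, or NULL if anything in it fails
  combined <- tryCatch({
    tableNodes <- getNodeSet(url.parsed, '//*[@id="table"]/table')
    newdata <- readHTMLTable(tableNodes[[1]], header = FALSE, stringsAsFactors = FALSE)
    rbind(newdata, big.data)
  },
  # The error handler must be a function; it signals failure with NULL
  error = function(e) NULL)
  # If there was an error, go to the next iteration
  # Sys.sleep(30) won't be executed in such a case
  if (is.null(combined)) next
  big.data <- combined
  Sys.sleep(30)
}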
And yes, Sys.sleep(30) makes R sleep for 30 seconds when it is executed. So if you want R to sleep in every iteration, regardless of whether parsing succeeds, consider moving that line in front of tryCatch.
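To see the skip-and-pause behaviour without touching the network, here is a minimal sketch that replaces the scraped pages with made-up data frames. The names pages, pause_secs and skipped are hypothetical, and the short pause stands in for the 30 seconds used above. The pause sits in front of tryCatch, so it runs on every iteration, including the skipped one.

```r
# Stand-ins for scraped tables (hypothetical data); the second one mimics
# the one-column "No current prices reported." page.
pages <- list(
  data.frame(item = "a", price = 1, stringsAsFactors = FALSE),
  data.frame(msg = "No current prices reported.", stringsAsFactors = FALSE),
  data.frame(item = "b", price = 2, stringsAsFactors = FALSE)
)

pause_secs <- 0.1       # would be 30 in the real scraper
big.data <- NULL        # rbind() ignores NULL, so the first table is kept as-is
skipped <- integer(0)   # indices of pages that could not be combined

for (i in seq_along(pages)) {
  Sys.sleep(pause_secs)  # in front of tryCatch: executed on every iteration

  combined <- tryCatch(
    rbind(pages[[i]], big.data),
    error = function(e) NULL  # column mismatch ends up here
  )

  if (is.null(combined)) {
    skipped <- c(skipped, i)
    next                      # skip this page, keep big.data unchanged
  }
  big.data <- combined
}
```

After the loop, big.data holds the two well-formed tables and skipped records which page was dropped, which can be handy for checking later why a page failed.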
For a more detailed explanation of tryCatch, see the well-written answer in "How to write trycatch in R".
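As a rough illustration of the general shape of tryCatch, the sketch below handles a warning and an error separately and uses finally for code that should run either way. The function parse_price is a hypothetical helper, not part of the original question or of any package.

```r
# Hypothetical helper: convert text to a number, returning NA on any problem.
parse_price <- function(x) {
  tryCatch({
    if (length(x) != 1) stop("expected a single value")
    as.numeric(x)  # emits a warning when x is not numeric text
  },
  warning = function(w) {
    message("warning: ", conditionMessage(w))
    NA_real_
  },
  error = function(e) {
    message("error: ", conditionMessage(e))
    NA_real_
  },
  finally = {
    # Runs whether or not a condition was raised
    message("finished parsing")
  })
}

ok   <- parse_price("12.5")         # plain conversion
warn <- parse_price("abc")          # goes through the warning handler
err  <- parse_price(c("1", "2"))    # goes through the error handler
```

Each handler is a function that receives the condition object, and the value it returns becomes the value of the whole tryCatch call.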