Checking the validity of a list of URLs using GET

Question

I have a .csv file of URLs that I need to validate.

I want to apply httr's GET to every row of the data frame.

 > websites
          website
1   www.msn.com
2   www.wazl.com
3  www.amazon.com
4 www.rifapro.com

I found similar questions and tried to apply the answers given there, but they did not work:

> apply(websites, 1, transform, result=GET(websites$website))


  Error: length(url) == 1 is not TRUE


> apply(websites, websites[,1], GET())
Error in handle_url(handle, url, ...) : 
  Must specify at least one of url or handle

I am not sure what I am doing wrong.

Recommended answer

You could do something like this:

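library(httr)

websites <- read.table(header = TRUE, text = "website
1   www.msn.com
2   www.wazl.com
3  www.amazon.com
4 www.rifapro.com")

# Prepend "http://" where the scheme is missing, then lowercase and de-duplicate
urls <- paste0(ifelse(grepl("^https?://", websites$website, ignore.case = TRUE), "", "http://"),
               websites$website)
urls <- unique(tolower(urls))

# One HEAD request per URL; try() keeps the loop going when a host cannot be reached
lst <- lapply(urls, function(url) try(HEAD(url), silent = TRUE))
names(lst) <- urls

# Status code per URL, with -999 marking requests that failed outright
sapply(lst, function(x) if (inherits(x, "try-error")) -999 else status_code(x))
# http://www.msn.com    http://www.wazl.com  http://www.amazon.com http://www.rifapro.com 
#                200                   -999                    405                   -999 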

IMHO there is no need for a GET request here: a HEAD request returns the status code without downloading the response body.
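If GET is still preferred (for instance because a server answers HEAD with 405, Method Not Allowed, as the www.amazon.com row above suggests), the same one-request-per-URL pattern applies. The attempts in the question failed because GET expects a single URL per call. The following is only a minimal sketch: `add_scheme`, `get_status`, and `check_urls` are hypothetical helper names, the 10-second timeout is an arbitrary assumption, and failed requests are reported as NA rather than -999.

```r
# Prepend "http://" to entries that do not already carry a scheme.
add_scheme <- function(x) {
  x <- as.character(x)  # guard against factor columns from read.table/read.csv
  ifelse(grepl("^https?://", x, ignore.case = TRUE), x, paste0("http://", x))
}

# Default fetcher: one GET per URL, returning only the status code.
# The 10-second timeout is an arbitrary choice for this sketch.
get_status <- function(url) {
  httr::status_code(httr::GET(url, httr::timeout(10)))
}

# Call `fetch` once per URL; errors (unresolvable host, timeout) become NA.
check_urls <- function(urls, fetch = get_status) {
  urls <- add_scheme(urls)
  vapply(urls, function(u) {
    tryCatch(as.integer(fetch(u)), error = function(e) NA_integer_)
  }, integer(1))
}
```

With the data frame from the question this would be called as `check_urls(websites$website)`, giving a named integer vector with one status code (or NA) per URL. Passing the fetcher as an argument is a design choice that lets the function be tried out without network access.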
