Error in open.connection(x,"rb") : HTTP error 406


Problem description

I am trying to read the contents of a website using read_html in R. However, for some websites, such as http://benchmarkrealestate.com/, I get this error: Error in open.connection(x,"rb") : HTTP error 406

What does this error mean? It only happens on some websites. I tried to look it up online, but wasn't able to find the exact reason why I get this error.

How do I fix this?

Solution

406 Not Acceptable

The requested resource is capable of generating only content not acceptable according to the Accept headers sent in the request.

The sentence above is lifted right off of Wikipedia.

Basically, whenever a web crawler makes a request to a website, it usually identifies itself, its application type and other information by submitting a characteristic identification string to its operating peer, i.e. the web server. This identification is transmitted in a header field called User-Agent. Some servers appear to refuse requests whose User-Agent they do not recognize as a browser, and the refusal can surface as a 406.

One way to have the content of the web page returned to your console is to set your user-agent string to something the server recognizes, such as a browser-style value, with the help of the curl package:

library(xml2)
library(rvest)
library(curl)

web_content <- read_html(curl('http://benchmarkrealestate.com/', handle = new_handle("useragent" = "Mozilla/5.0")))
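If setting the user agent alone is not enough, a slightly more explicit variant may help with debugging. The sketch below is only an illustration, not part of the original answer: the helper names make_handle and fetch_html are hypothetical, and the Accept value is an assumed browser-like default. It sets both the User-Agent and the Accept header, fetches the page with curl_fetch_memory (which returns the status code instead of raising an error), and only then hands the body to read_html.

```r
library(curl)
library(xml2)

# Hypothetical helper: build a curl handle with a browser-like User-Agent
# and an explicit Accept header. Both default values are assumptions.
make_handle <- function(user_agent = "Mozilla/5.0",
                        accept = "text/html,application/xhtml+xml") {
  h <- new_handle()
  handle_setopt(h, useragent = user_agent)
  handle_setheaders(h, "Accept" = accept)
  h
}

# Hypothetical helper: fetch a page into memory, check the status code
# ourselves, then parse the raw body as HTML.
fetch_html <- function(url, handle = make_handle()) {
  res <- curl_fetch_memory(url, handle = handle)
  if (res$status_code >= 400) {
    stop("HTTP error ", res$status_code, " for ", url)
  }
  read_html(res$content)
}

# Example call (requires network access):
# web_content <- fetch_html("http://benchmarkrealestate.com/")
```

Checking res$status_code directly makes it easier to see whether a change to the headers actually altered the server's response.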

You may also want to read up on header fields.
