How to fix Python urlopen error [Errno 8] when using eventlet green

Question

Python novice here.

I'm making a lot of asynchronous HTTP requests using eventlet and urllib2. At the top of my file I have:

import eventlet
import urllib
from eventlet.green import urllib2

Then I make a lot of asynchronous http requests that succeed with this line:

conn = urllib2.urlopen(signed_url, None)

And all of a sudden, I get this error:

URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

This error occurs on the same urllib2.urlopen line, which is odd because that line succeeded many times before. Also, when I print signed_url and paste it into my browser, I get a proper response, so the URL is well formed.

I've looked through other posts, but cannot find the right debugging strategy for this. Conceptually, what could be causing this error? And how do you recommend I go about fixing it?

I'm using Python 2.7.6.

Thanks.

Recommended answer

The 'nodename nor servname provided, or not known' error means DNS resolution failed. The most likely cause is rate limiting by the upstream DNS server. If you are serious about web crawling, I can recommend two approaches:

  • easy: when you get this error, throttle down your concurrency limit so that you make fewer requests per minute. Treat the first N occurrences of this error as temporary and retry the URL after a short delay. Set up a local caching recursive DNS server (e.g. dnsmasq, unbound).
  • hard: split DNS resolution from HTTP fetching. Keep a separate queue of DNS names to resolve. Pass the resolved IP address in the URL (http://1.2.3.4/path) together with a Host: domain header to urlopen. This lets you limit the concurrency of DNS requests and of actual HTTP requests separately. It will not help if you mostly fetch only one request per unique host. Find several recursive DNS servers to distribute the work across, collect their response time statistics, and use the faster ones more frequently.
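
The retry part of the "easy" approach could look roughly like the sketch below. The names fetch_with_retry, is_dns_failure, max_retries and delay are hypothetical, and the retry count and delay are arbitrary assumptions rather than recommended values. The import fallback is only there so the sketch runs on both Python 3 and the Python 2.7 used in the question.

```python
import socket
import time

try:  # Python 3 module names
    from urllib.request import urlopen
    from urllib.error import URLError
except ImportError:  # Python 2.7, as in the question
    from urllib2 import urlopen, URLError


def is_dns_failure(exc):
    """Return True if a URLError wraps a name-resolution error."""
    return isinstance(getattr(exc, "reason", None), socket.gaierror)


def fetch_with_retry(url, opener=urlopen, max_retries=3, delay=1.0,
                     sleep=time.sleep):
    """Open url, retrying up to max_retries times on DNS failures only.

    opener and sleep are injectable (hypothetical parameters) so that
    green versions can be passed in when running under eventlet.
    """
    attempt = 0
    while True:
        try:
            return opener(url)
        except URLError as exc:
            # Anything that is not a DNS failure is re-raised immediately.
            if not is_dns_failure(exc) or attempt >= max_retries:
                raise
            attempt += 1
            sleep(delay * attempt)  # wait a little longer after each failure
```

Under eventlet you would presumably pass the green urlopen as opener and eventlet.sleep as sleep, and cap concurrency by running the fetches through an eventlet.GreenPool with a smaller size.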
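
For the "hard" approach, the step of passing the resolved IP in the URL plus a Host header might be sketched as follows. resolve, build_ip_request and the dict-based cache are hypothetical helpers invented for illustration, not part of urllib2 or eventlet, and the sketch assumes IPv4 and plain http URLs. With https, certificate checks are made against the IP address and will typically fail.

```python
import socket

try:  # Python 3 module names
    from urllib.request import Request
    from urllib.parse import urlsplit, urlunsplit
except ImportError:  # Python 2.7, as in the question
    from urllib2 import Request
    from urlparse import urlsplit, urlunsplit


def resolve(hostname, cache, resolver=socket.gethostbyname):
    """Return an IPv4 address for hostname, looking it up at most once."""
    if hostname not in cache:
        cache[hostname] = resolver(hostname)
    return cache[hostname]


def build_ip_request(url, cache, resolver=socket.gethostbyname):
    """Build a Request that targets the resolved IP and keeps the Host header."""
    parts = urlsplit(url)
    ip = resolve(parts.hostname, cache, resolver)
    if parts.port is None:
        netloc, host_header = ip, parts.hostname
    else:
        netloc = "%s:%d" % (ip, parts.port)
        host_header = "%s:%d" % (parts.hostname, parts.port)
    ip_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    # The explicit Host header tells the server which site is being requested.
    return Request(ip_url, headers={"Host": host_header})
```

The returned Request can be handed to urlopen. Because resolution happens in its own function, the lookups can be run through one pool and the HTTP fetches through another, which is what allows the two concurrency limits to differ.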
