python - Is pyspider's proxy support broken, or am I using it wrong?
Question
With the same proxy address (host:port), requests returns the DOM of the requested page.
When that proxy is configured in pyspider, this is what comes back:
[E 170727 13:14:38 base_handler:203] HTTP 599: Failed connect to www.douban.com:808; Resource temporarily unavailable
Traceback (most recent call last):
  File "/usr/local/python27/lib/python2.7/site-packages/pyspider/libs/base_handler.py", line 196, in run_task
    result = self._run_task(task, response)
  File "/usr/local/python27/lib/python2.7/site-packages/pyspider/libs/base_handler.py", line 175, in _run_task
    response.raise_for_status()
  File "/usr/local/python27/lib/python2.7/site-packages/pyspider/libs/response.py", line 172, in raise_for_status
    six.reraise(Exception, Exception(self.error), Traceback.from_string(self.traceback).as_traceback())
  File "/usr/local/python27/lib/python2.7/site-packages/pyspider/fetcher/tornado_fetcher.py", line 378, in http_fetch
    response = yield gen.maybe_future(self.http_client.fetch(request))
  File "/usr/local/python27/lib/python2.7/site-packages/tornado/httpclient.py", line 102, in fetch
    self._async_client.fetch, request, **kwargs))
  File "/usr/local/python27/lib/python2.7/site-packages/tornado/ioloop.py", line 458, in run_sync
    return future_cell[0].result()
  File "/usr/local/python27/lib/python2.7/site-packages/tornado/concurrent.py", line 238, in result
    raise_exc_info(self._exc_info)
  File "<string>", line 3, in raise_exc_info
Exception: HTTP 599: Failed connect to www.douban.com:808; Resource temporarily unavailable
crawl_config = {
    'headers': headers,
    'timeout': 100,
    'itag': 'v2',
    # 'proxy': urllib2.urlopen("http://10.124.81.73:5000/get").read()
    'proxy': '114.239.148.103:808'
}
Why does the proxy's port end up attached to the douban host in the request (www.douban.com:808)?
And why is the status 599?
Solution
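599 is not a standard HTTP status code. Tornado, whose HTTP client pyspider's fetcher uses (see tornado_fetcher.py in the traceback), reports 599 when no HTTP response was received at all, for example on a connection failure or a timeout.
The "Failed connect to www.douban.com:808" text appears to come from the underlying HTTP client, which most likely pairs the target host name with the port it actually dialed, here the proxy's port 808. It does not mean that pyspider appended the port to the douban URL.
In other words, the connection through the configured 'proxy': '114.239.148.103:808' failed, so that proxy was not usable from pyspider when the request was made.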
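To confirm whether the proxy from the question is reachable from the machine running pyspider, a plain TCP connection test is a reasonable first step. The sketch below uses only the Python standard library. The helper names parse_proxy and proxy_reachable are hypothetical and not part of pyspider or Tornado, and a successful connect only shows that the port accepts connections, not that the proxy will actually relay the request.

```python
import socket


def parse_proxy(proxy):
    """Split 'host:port' (optionally 'user:pass@host:port') into (host, port).

    Hypothetical helper for illustration; not a pyspider API.
    """
    hostport = proxy.rsplit('@', 1)[-1]       # drop optional credentials
    host, _, port = hostport.rpartition(':')  # split on the last colon
    if not host or not port.isdigit():
        raise ValueError('expected host:port, got %r' % proxy)
    return host, int(port)


def proxy_reachable(proxy, timeout=5.0):
    """Return True if a TCP connection to the proxy can be opened in time."""
    host, port = parse_proxy(proxy)
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, DNS failure, or timed out.
        return False
    sock.close()
    return True


# Example (performs a real network connection, so it is left commented out):
# print(proxy_reachable('114.239.148.103:808'))
```

If this returns False on the pyspider host but the same proxy works with requests elsewhere, the difference is more likely in the network path or in the proxy's availability at that moment than in pyspider's URL handling.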