scrapy proxy middleware without project


Question

I am using scrapy's runspider method to run a spider that I've set up and defined without a project. I am setting my custom settings and downloader middlewares to enable the HTTP proxy middleware as follows:

custom_settings = {
    'DOWNLOADER_MIDDLEWARES': {
        'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 750,
    }
}
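
For context, a minimal sketch of how such a standalone spider file might look (the class name, spider name, and start URL are placeholders, not from the question); it can be run directly with scrapy runspider myspider.py:

import scrapy

class ProxySpider(scrapy.Spider):
    # Hypothetical single-file spider; no Scrapy project is needed.
    name = "proxy_spider"
    start_urls = ["https://example.com"]

    # Per-spider settings take the place of a project's settings.py
    # when the spider is run via `scrapy runspider`.
    custom_settings = {
        'DOWNLOADER_MIDDLEWARES': {
            'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 750,
        },
    }

    def parse(self, response):
        self.logger.info("Fetched %s", response.url)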

and then using it in my request with

request.meta['proxy'] = "proxy-ip:proxy-port"

yield request
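
Continuing the sketch above, a start_requests method would attach the proxy to each request roughly like this (again a sketch; the proxy address is the question's placeholder, and note it carries no scheme, which is what the answer below addresses):

    def start_requests(self):
        for url in self.start_urls:
            request = scrapy.Request(url, callback=self.parse)
            # The form described in the question: host and port only.
            request.meta['proxy'] = "proxy-ip:proxy-port"
            yield request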

but the spider does not run and says:

File "/usr/lib/python2.7/dist-packages/twisted/internet/abstract.py", line 522, in isIPv6Address if '%' in addr: TypeError: argument of type 'NoneType' is not iterable

What am I doing wrong?

Answer

After a lot of digging (not much logging going on in Scrapy, I'm afraid), I found that this problem can be caused by not specifying the scheme in the proxy address; i.e., Scrapy expects the proxy to be passed as a URI, so in your case, instead of:

request.meta['proxy'] = "proxy-ip:proxy-port"  # doesn't work

you want this:

request.meta['proxy'] = "http://proxy-ip:proxy-port"  # does work

(As far as I can make out, the http is just ignored, but without it the rest can't be parsed by urlparse).
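
A quick way to see what goes wrong, as a sketch using Python 3's urllib.parse (the question's traceback is Python 2, where the module is called urlparse, but the result is the same for this input as far as I can tell; the proxy address here is made up for illustration): without a scheme there is no netloc, so no hostname can be extracted from the proxy URI, which is presumably how Twisted ends up being handed the None seen in the traceback above.

from urllib.parse import urlparse  # on Python 2: from urlparse import urlparse

# Hypothetical proxy address, for illustration only.
print(urlparse("10.0.0.1:3128").hostname)         # None -> no host to connect to
print(urlparse("http://10.0.0.1:3128").hostname)  # '10.0.0.1'
print(urlparse("http://10.0.0.1:3128").port)      # 3128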
