Scrapy and proxies
Problem description
How do you use proxy support with the Python web-scraping framework Scrapy?
From the Scrapy FAQ:

Does Scrapy work with HTTP proxies?
Yes. Support for HTTP proxies is provided (since Scrapy 0.8) through the HTTP Proxy downloader middleware. See HttpProxyMiddleware.
The easiest way to use a proxy is to set the environment variable http_proxy. How this is done depends on your shell.
C:\>set http_proxy=http://proxy:port
csh% setenv http_proxy http://proxy:port
sh$ export http_proxy=http://proxy:port
If you want to use an HTTPS proxy and visit HTTPS sites, set the environment variable https_proxy as follows:
C:\>set https_proxy=https://proxy:port
csh% setenv https_proxy https://proxy:port
sh$ export https_proxy=https://proxy:port
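
To check what a Python process will actually see once these variables are set, a small sketch like the one below may help. It uses only the standard library's urllib.request.getproxies(), which, as far as I know, is also what HttpProxyMiddleware consults when it starts up (treat that as an assumption and check the middleware source for your Scrapy version). The proxy address proxy.example.com:8080 and the helper name proxies_from_env are made up for illustration.

```python
import os
from urllib.request import getproxies


def proxies_from_env(http_proxy, https_proxy):
    """Set the proxy environment variables for this process and
    return the scheme -> proxy URL mapping that urllib detects."""
    os.environ["http_proxy"] = http_proxy
    os.environ["https_proxy"] = https_proxy
    # getproxies() reads the *_proxy environment variables first
    return getproxies()


# Hypothetical proxy address, replace with your own proxy host and port
PROXY_URL = "http://proxy.example.com:8080"

proxies = proxies_from_env(PROXY_URL, PROXY_URL)
print("http  ->", proxies.get("http"))
print("https ->", proxies.get("https"))
```

If both schemes print your proxy URL, a crawl started from the same environment should be able to pick the proxy up without further changes.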