How to make urllib2 requests through Tor in Python?
Question
I'm trying to crawl websites with a crawler written in Python. I want to integrate Tor with it so that I can crawl sites anonymously.
I tried the following, but it doesn't seem to work. When I check my IP address from Python, it is still the same as it was before I started using Tor.
import urllib2
proxy_handler = urllib2.ProxyHandler({"tcp":"http://127.0.0.1:9050"})
opener = urllib2.build_opener(proxy_handler)
urllib2.install_opener(opener)
Recommended answer
You are trying to connect to Tor's SOCKS port, and Tor rejects any non-SOCKS traffic there. You can connect through a middleman, Privoxy, which accepts HTTP proxy requests on port 8118.
Example:
import urllib2

# Send HTTP requests to Privoxy (port 8118) instead of Tor's SOCKS port
proxy_support = urllib2.ProxyHandler({"http" : "127.0.0.1:8118"})
opener = urllib2.build_opener(proxy_support)
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
print opener.open('http://www.google.com').read()
Also note the dictionary passed to ProxyHandler: the key is the URL scheme ("http"), and the value here is a bare ip:port with no http prefix.
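For readers on Python 3, where urllib2 was merged into urllib.request, the same approach might look roughly like the sketch below. It assumes Privoxy is already running on 127.0.0.1:8118 and is configured to forward to Tor. The names PRIVOXY_ADDRESS, build_privoxy_opener and fetch_text are made up for illustration, and the "https" entry is an addition that is not in the answer above.

```python
import urllib.request

# Assumption: Privoxy listens on the port mentioned in the answer.
PRIVOXY_ADDRESS = "127.0.0.1:8118"


def build_privoxy_opener(proxy=PRIVOXY_ADDRESS):
    """Return an opener that sends HTTP and HTTPS requests via the given proxy."""
    # Keys are URL schemes; values are the proxy's ip:port.
    proxy_support = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(proxy_support)
    opener.addheaders = [("User-agent", "Mozilla/5.0")]
    return opener


def fetch_text(opener, url, timeout=30):
    """Fetch url through the opener and return the response body as text."""
    with opener.open(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

One way to check that traffic really goes through Tor is to call fetch_text twice against any page that reports the caller's IP address, once with urllib.request.build_opener() and once with build_privoxy_opener(). If the proxy chain works, the two reported addresses should differ.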