Python urllib2 force IPv4
Question
I am running a script using python that uses urllib2 to grab data from a weather api and display it on screen. I have had the problem that when I query the server, I get a "no address associated with hostname" error. I can view the output of the api with a web browser, and I can download the file with wget, but I have to force IPv4 to get it to work. Is it possible to force IPv4 in urllib2 when using urllib2.urlopen?
Recommended answer
Not directly, no.
So, what can you do?
One possibility is to explicitly resolve the hostname to IPv4 yourself, and then use the IPv4 address instead of the name as the host. For example:
import socket
import urllib2
host = socket.gethostbyname('example.com')  # gethostbyname is IPv4-only
page = urllib2.urlopen('http://{}/path'.format(host))
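socket.gethostbyname returns only the first IPv4 address. If you want every IPv4 address for a name, socket.getaddrinfo restricted to AF_INET should do it. A minimal sketch, where the helper name resolve_ipv4 is made up for illustration and is not part of the original answer:

```python
import socket


def resolve_ipv4(hostname, port=80):
    """Return the IPv4 addresses for hostname, in resolver order, without duplicates."""
    # family=AF_INET restricts the lookup to IPv4 results only.
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in infos:
        ip = sockaddr[0]  # for AF_INET, sockaddr is an (ip, port) tuple
        if ip not in addresses:
            addresses.append(ip)
    return addresses
```

Any of the returned addresses can then be substituted into the URL in place of the hostname.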
However, some virtual-server sites may require a Host: example.com header, and they will instead get a Host: 93.184.216.119 header. You can work around that by overriding the header:
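import socket
import urllib2

# Connect to the IPv4 address, but send the original name in the Host header.
host = socket.gethostbyname('example.com')
request = urllib2.Request('http://{}/path'.format(host),
                          headers={'Host': 'example.com'})
page = urllib2.urlopen(request)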
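On Python 3, urllib2 has become urllib.request, and the same idea can be wrapped in a small helper. This is only a sketch under assumptions: the name build_ipv4_request and its resolve parameter are hypothetical, and URLs are assumed not to contain embedded credentials.

```python
import socket
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request


def build_ipv4_request(url, resolve=socket.gethostbyname):
    """Build a Request aimed at the host's IPv4 address, keeping the original Host header."""
    parts = urlsplit(url)
    ip = resolve(parts.hostname)  # gethostbyname returns an IPv4 address only
    # Keep an explicit port, if the URL had one, in both the target and the header.
    suffix = "" if parts.port is None else ":{}".format(parts.port)
    target = urlunsplit(
        (parts.scheme, ip + suffix, parts.path, parts.query, parts.fragment))
    # The Host header carries the original name so virtual hosting still works.
    return Request(target, headers={"Host": parts.hostname + suffix})
```

The result would be passed to urllib.request.urlopen. This is probably only useful for plain http URLs, since connecting to an https site by IP address generally breaks certificate hostname checks.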
Alternatively, you can provide your own handlers in place of the standard ones. But the standard handler is mostly just a wrapper around httplib.HTTPConnection, and the real problem is in HTTPConnection.connect.
So, the clean way to do this is to create your own subclass of httplib.HTTPConnection, which overrides connect like this:
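import httplib
import socket

class MyHTTPConnection(httplib.HTTPConnection):
    def connect(self):
        # gethostbyname only ever returns an IPv4 address.
        host = socket.gethostbyname(self.host)
        self.sock = socket.create_connection((host, self.port),
                                             self.timeout, self.source_address)
        if self._tunnel_host:
            self._tunnel()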
Then create your own subclass of urllib2.HTTPHandler that overrides http_open to use your subclass:
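class MyHTTPHandler(urllib2.HTTPHandler):
    def http_open(self, req):
        # mywrapper stands for whichever module holds your MyHTTPConnection subclass.
        return self.do_open(mywrapper.MyHTTPConnection, req)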
... and similarly for HTTPSHandler, and then hook up all the stuff properly as shown in the urllib2 docs.
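Put together, the pieces might look roughly like the following on Python 3, where httplib is http.client and urllib2 is urllib.request. This is a sketch, not a tested drop-in: the class names IPv4HTTPConnection and IPv4HTTPHandler are made up here, only http:// URLs are covered, and _tunnel_host and _tunnel are private attributes carried over from the snippet above.

```python
import http.client
import socket
import urllib.request


class IPv4HTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that only ever connects over IPv4."""

    def connect(self):
        # gethostbyname never returns an IPv6 address.
        ipv4 = socket.gethostbyname(self.host)
        self.sock = socket.create_connection(
            (ipv4, self.port), self.timeout, self.source_address)
        # Proxy (CONNECT) tunnelling; these are private http.client
        # attributes and may change between Python versions.
        if self._tunnel_host:
            self._tunnel()


class IPv4HTTPHandler(urllib.request.HTTPHandler):
    """Handler that makes urllib use IPv4HTTPConnection for http:// URLs."""

    def http_open(self, req):
        return self.do_open(IPv4HTTPConnection, req)


# build_opener replaces the default HTTPHandler with the subclass passed in.
opener = urllib.request.build_opener(IPv4HTTPHandler)
```

You would then call opener.open(url) directly, or pass the opener to urllib.request.install_opener so that plain urlopen calls use it. Because self.host is left untouched, the Host header still carries the original name.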
The quick & dirty way to do the same thing is to just monkeypatch httplib.HTTPConnection.connect to the above function.
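A related quick-and-dirty variant, which patches a different point than the one just described, is to wrap socket.getaddrinfo so that every lookup in the process asks for AF_INET only. The sketch below targets Python 3; the context manager name force_ipv4 is hypothetical, and the patch affects all code using the socket module while it is active, not just urllib.

```python
import socket
from contextlib import contextmanager


@contextmanager
def force_ipv4():
    """Temporarily make socket.getaddrinfo return IPv4 results only."""
    original = socket.getaddrinfo

    def ipv4_only(host, port, family=0, type=0, proto=0, flags=0):
        # Ignore the requested family and always ask for AF_INET.
        return original(host, port, socket.AF_INET, type, proto, flags)

    socket.getaddrinfo = ipv4_only
    try:
        yield
    finally:
        # Always restore the real function, even if the body raised.
        socket.getaddrinfo = original
```

Any request made inside a "with force_ipv4():" block that resolves names through socket.create_connection should then only see IPv4 addresses.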
Finally, you could use a different library instead of urllib2. From what I remember, requests doesn't make this any easier (ultimately, you have to override or monkeypatch slightly different methods, but it's effectively the same). However, any libcurl wrapper will allow you to do the equivalent of curl_easy_setopt(h, CURLOPT_IPRESOLVE, CURL_IPRESOLVE_V4).