Force python mechanize/urllib2 to only use A requests?

Question

Here is a related question, but I could not figure out how to apply its answer to mechanize/urllib2: "how to force python httplib library to use only A requests".

Basically, given this simple code:

#!/usr/bin/python
import urllib2
print urllib2.urlopen('http://python.org/').read(100)

This results in Wireshark showing the following:

  0.000000  10.102.0.79 -> 8.8.8.8      DNS Standard query A python.org
  0.000023  10.102.0.79 -> 8.8.8.8      DNS Standard query AAAA python.org
  0.005369      8.8.8.8 -> 10.102.0.79  DNS Standard query response A 82.94.164.162
  5.004494  10.102.0.79 -> 8.8.8.8      DNS Standard query A python.org
  5.010540      8.8.8.8 -> 10.102.0.79  DNS Standard query response A 82.94.164.162
  5.010599  10.102.0.79 -> 8.8.8.8      DNS Standard query AAAA python.org
  5.015832      8.8.8.8 -> 10.102.0.79  DNS Standard query response AAAA 2001:888:2000:d::a2

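That is a 5-second delay: the first AAAA query gets no response, and the lookup is only retried at around the 5-second mark.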

I don't have IPv6 enabled anywhere on my system (Gentoo compiled with USE=-ipv6), so I don't think Python has any reason to even try an IPv6 lookup.

The question referenced above suggests explicitly setting the socket type to AF_INET, which sounds great. However, I have no idea how to force urllib or mechanize to use sockets that I create.

EDIT: I know that the AAAA queries are the issue because other apps had the same delay, and as soon as I recompiled them with IPv6 disabled the problem went away... except in Python, which still performs the AAAA requests.
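One way to check from inside Python whether the address family is what triggers the slow lookup is to time socket.getaddrinfo(..) with AF_UNSPEC and with AF_INET. The following is only a minimal sketch, assuming Python 3; timed_lookup is a hypothetical helper name, not part of any library.

```python
import socket
import time


def timed_lookup(host, family, port=80):
    """Resolve host with the given address family and return (seconds, results).

    timed_lookup is a hypothetical helper used only for this illustration.
    """
    start = time.monotonic()
    try:
        results = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        # Resolution failed (for example, no record of that family).
        results = []
    return time.monotonic() - start, results


# Example use (performs real DNS lookups, so it is left commented out):
# for name, family in (("AF_UNSPEC", socket.AF_UNSPEC), ("AF_INET", socket.AF_INET)):
#     elapsed, results = timed_lookup("python.org", family)
#     print(name, round(elapsed, 3), len(results))
```

If the AF_UNSPEC lookup takes about 5 seconds longer than the AF_INET one, that would be consistent with the capture above, where the AAAA query is the one that stalls.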

Recommended answer

Based on J.J.'s answer.

This forces the family parameter of socket.getaddrinfo(..) to socket.AF_INET instead of socket.AF_UNSPEC (zero, which seems to be what socket.create_connection uses). It should apply not only to calls from urllib2 but to every call to socket.getaddrinfo(..):

#--------------------
# do this once at program startup
#--------------------
import socket
origGetAddrInfo = socket.getaddrinfo

def getAddrInfoWrapper(host, port, family=0, socktype=0, proto=0, flags=0):
    return origGetAddrInfo(host, port, socket.AF_INET, socktype, proto, flags)

# replace the original socket.getaddrinfo by our version
socket.getaddrinfo = getAddrInfoWrapper

#--------------------
import urllib2

print urllib2.urlopen("http://python.org/").read(100)

This works for me, at least in this simple case.
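Because the replacement above is global and permanent for the process, a scoped variant may be preferable in larger programs. The following is a sketch of the same idea under two assumptions: Python 3 (where urllib2 became urllib.request), and that the HTTP library resolves names through socket.getaddrinfo. The names ipv4_only_getaddrinfo and force_ipv4 are hypothetical.

```python
import contextlib
import socket

# Keep a reference to the real resolver so it can be restored later.
_orig_getaddrinfo = socket.getaddrinfo


def ipv4_only_getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
    # Ignore the caller's family and always request IPv4 (A records only).
    return _orig_getaddrinfo(host, port, socket.AF_INET, type, proto, flags)


@contextlib.contextmanager
def force_ipv4():
    """Temporarily make every socket.getaddrinfo call IPv4-only.

    Hypothetical helper; not re-entrant and not thread-safe, since it swaps
    a module-level attribute for the duration of the block.
    """
    socket.getaddrinfo = ipv4_only_getaddrinfo
    try:
        yield
    finally:
        socket.getaddrinfo = _orig_getaddrinfo


# Example use (performs a real HTTP request, so it is left commented out):
# import urllib.request
# with force_ipv4():
#     print(urllib.request.urlopen("http://python.org/").read(100))
```

The original resolver is restored when the block exits, so code outside the with statement still sees the default AF_UNSPEC behaviour.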
