Using urllib2 via proxy
Problem Description
I am trying to use urllib2 through a proxy; however, after trying just about every variation of passing my verification details to urllib2, I either get a request that hangs forever and returns nothing, or I get a 407 error. I can connect to the web fine using my browser, which connects to a proxy PAC file and redirects accordingly; however, I can't seem to do anything via the command line with curl, wget, urllib2, etc., even when I use the proxies that the PAC file redirects to. I tried setting my proxy to every proxy in the PAC file using urllib2, none of which worked.
My current script looks like this:
import urllib2 as url
proxy = url.ProxyHandler({'http': 'username:password@my.proxy:8080'})
auth = url.HTTPBasicAuthHandler()
opener = url.build_opener(proxy, auth, url.HTTPHandler)
url.install_opener(opener)
url.urlopen("http://www.google.com/")
which throws HTTP Error 407: Proxy Authentication Required.
I have also tried:
import urllib2 as url
handlePass = url.HTTPPasswordMgrWithDefaultRealm()
handlePass.add_password(None, "http://my.proxy:8080", "username", "password")
auth_handler = url.HTTPBasicAuthHandler(handlePass)
opener = url.build_opener(auth_handler)
url.install_opener(opener)
url.urlopen("http://www.google.com")
which hangs like curl or wget timing out.
What do I need to do to diagnose the problem? How is it possible that I can connect via my browser but not from the command line on the same computer using what would appear to be the same proxy and credentials?
Might it be something to do with the router? If so, how can it distinguish between browser HTTP requests and command-line HTTP requests?
Recommended Answer
Frustrations like this are what drove me to use Requests. If you're doing significant amounts of work with urllib2, you really ought to check it out. For example, to do what you wish to do using Requests, you could write:
import requests
from requests.auth import HTTPProxyAuth

proxy = {'http': 'http://my.proxy:8080'}
auth = HTTPProxyAuth('username', 'password')
r = requests.get('http://www.google.com/', proxies=proxy, auth=auth)
print(r.text)
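If you do need to stay on urllib2, one common cause of a 407 with ProxyHandler is omitting the scheme in the proxy URL. A minimal sketch, reusing the placeholder host and credentials from the question (the explicit http:// prefix is the only substantive change from the original script):

```python
# Minimal sketch assuming the question's placeholder proxy and credentials.
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # same API under Python 3

proxy = urllib2.ProxyHandler({
    'http': 'http://username:password@my.proxy:8080',  # note the scheme
})
opener = urllib2.build_opener(proxy)
# urllib2.install_opener(opener)  # make it the process-wide default
# response = opener.open('http://www.google.com/')  # requires the real proxy
```

Whether this resolves the 407 depends on the proxy expecting Basic auth in the proxy URL; some proxies require NTLM or Digest, which this handler does not cover.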
Or you could wrap it in a Session object and every request will automatically use the proxy information (plus it will store & handle cookies automatically!):
s = requests.Session()
s.proxies = proxy  # in requests >= 1.0, set these on the session
s.auth = auth      # rather than passing them to the constructor
r = s.get('http://www.google.com/')
print(r.text)
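Since curl and wget fail in the same way, it may also be worth checking the standard proxy environment variables, which curl, wget, and requests all honor. A hedged sketch, again using the question's placeholder host and credentials:

```python
# Hedged sketch: exporting the standard proxy variables once covers curl,
# wget, and requests alike. The host and credentials are the placeholders
# from the question; percent-encode any special characters in a real
# password.
import os

PROXY = 'http://username:password@my.proxy:8080'
os.environ['http_proxy'] = PROXY
os.environ['https_proxy'] = PROXY

# A requests.get(...) call in this process, or a curl/wget subprocess
# launched from it, would now route through the proxy.
```

If curl still hangs with these variables set, the problem is likely upstream of Python entirely (e.g. the proxy rejecting non-browser clients or requiring a different auth scheme), which narrows the diagnosis considerably.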