How can I open a website with urllib via proxy in Python?

Question

I have a program that checks a website, and I want to know how I can check it via a proxy in Python.

Here is the code, just as an example:

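import time
import urllib  # Python 2 API: urllib.urlopen

# `website` is assumed to be defined elsewhere as the URL to check.
while True:
    try:
        h = urllib.urlopen(website)
        break
    except Exception:  # unlike a bare except, still lets Ctrl-C through
        print '['+time.strftime('%Y/%m/%d %H:%M:%S')+'] '+'ERROR. Trying again in a few seconds...'
        time.sleep(5)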

Recommended answer

By default, urlopen uses the environment variable http_proxy to determine which HTTP proxy to use:

$ export http_proxy='http://myproxy.example.com:1234'
$ python myscript.py  # Using http://myproxy.example.com:1234 as a proxy
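
To check which proxy would be picked up from the environment, one option is urllib.request.getproxies(). The sketch below assumes Python 3 (the answer's own snippets use the Python 2 urllib module); the helper name detected_http_proxy is made up for illustration, and the variable is set from inside the script only to keep the example self-contained.

```python
import os
import urllib.request


def detected_http_proxy():
    """Return the HTTP proxy found in the proxy settings, or None."""
    # getproxies() reads variables such as http_proxy from the environment
    # (and may fall back to system settings on some platforms).
    return urllib.request.getproxies().get('http')


# Equivalent to `export http_proxy=...` in the shell, for this process only.
os.environ['http_proxy'] = 'http://myproxy.example.com:1234'
print("Detected HTTP proxy: %s" % detected_http_proxy())
```

In Python 3, urllib.request.urlopen uses these same settings by default, so no code change is needed when the proxy comes from the environment.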

If you instead want to specify a proxy inside your application, you can give a proxies argument to urlopen:

proxies = {'http': 'http://myproxy.example.com:1234'}
print("Using HTTP proxy %s" % proxies['http'])
urllib.urlopen("http://www.google.com", proxies=proxies)
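
The proxies argument shown above belongs to the Python 2 urllib.urlopen. On Python 3, urllib.request.urlopen does not take that argument, and a common alternative is to build an opener around urllib.request.ProxyHandler. The following is a minimal sketch under that assumption; the function names build_proxy_opener and fetch_via_proxy are hypothetical, and the proxy address is the same placeholder used in the answer.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP requests through proxy_url."""
    handler = urllib.request.ProxyHandler({'http': proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_url, timeout=10):
    """Fetch url through the given HTTP proxy and return the body as bytes."""
    opener = build_proxy_opener(proxy_url)
    response = opener.open(url, timeout=timeout)
    try:
        return response.read()
    finally:
        response.close()
```

A call would look like fetch_via_proxy("http://www.google.com", "http://myproxy.example.com:1234"). The opener is kept local rather than installed globally with install_opener, so other code in the same process keeps its default proxy behavior.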

If I understand your comments correctly, you want to try several proxies and print each proxy as you try it. How about something like this?

import time
import urllib  # Python 2 API: urllib.urlopen accepts a proxies argument

candidate_proxies = ['http://proxy1.example.com:1234',
                     'http://proxy2.example.com:1234',
                     'http://proxy3.example.com:1234']
for proxy in candidate_proxies:
    print("Trying HTTP proxy %s" % proxy)
    try:
        result = urllib.urlopen("http://www.google.com", proxies={'http': proxy})
        print("Got URL using proxy %s" % proxy)
        break
    except Exception:  # unlike a bare except, still lets Ctrl-C through
        print("Trying next proxy in 5 seconds")
        time.sleep(5)
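
For Python 3, the same try-each-proxy idea could be sketched with urllib.request.ProxyHandler. This is an illustration under assumptions, not part of the original answer: the function name open_with_first_working_proxy and its delay and timeout parameters are invented here, and any OSError (which covers urllib.error.URLError, HTTP error statuses, and socket timeouts) is treated as "this proxy failed".

```python
import time
import urllib.request


def open_with_first_working_proxy(url, candidate_proxies, delay=5, timeout=10):
    """Try each proxy in order; return (proxy, response) for the first success.

    Returns (None, None) if every candidate fails.
    """
    for proxy in candidate_proxies:
        print("Trying HTTP proxy %s" % proxy)
        handler = urllib.request.ProxyHandler({'http': proxy})
        opener = urllib.request.build_opener(handler)
        try:
            response = opener.open(url, timeout=timeout)
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses.
            print("Proxy %s failed (%s); trying next proxy in %s seconds"
                  % (proxy, exc, delay))
            time.sleep(delay)
        else:
            print("Got URL using proxy %s" % proxy)
            return proxy, response
    return None, None
```

Called with the candidate_proxies list above and "http://www.google.com", it returns the first proxy that answered together with the open response, which the caller should read and close.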
