Python urllib cache

Question

I'm writing a script in Python that should determine if it has internet access.

import urllib

CHECK_PAGE     = "http://64.37.51.146/check.txt"
CHECK_VALUE    = "true\n"
PROXY_VALUE    = "Privoxy"
OFFLINE_VALUE  = ""

page = urllib.urlopen(CHECK_PAGE)
response = page.read()
page.close()

if response.find(PROXY_VALUE) != -1:
    urllib.getproxies = lambda x = None: {}
    page = urllib.urlopen(CHECK_PAGE)
    response = page.read()
    page.close()

if response != CHECK_VALUE:
    print "'" + response + "' != '" + CHECK_VALUE + "'" # 
else:
    print "You are online!"

I use a proxy on my computer, so correct proxy handling is important. If it can't connect to the internet through the proxy, it should bypass the proxy and see if it's stuck at a login page (as many public hotspots I use do). With that code, if I am not connected to the internet, the first read() returns the proxy's error page. But when I bypass the proxy after that, I get the same page. If I bypass the proxy BEFORE making any requests, I get an error as I should. I think Python is caching the page from the first time around.

How do I force Python to clear its cache (or is this some other problem)?
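
A minimal probe (a sketch assuming CPython 2.x; urllib._urlopener is an implementation detail, not public API, and example.com is a stand-in URL) suggests that what gets cached is urllib's opener object, not the page:

import urllib

urllib.urlopen("http://example.com").close()  # first call builds and caches the opener
print urllib._urlopener                       # the cached FancyURLopener instance

urllib.getproxies = lambda: {}                # too late: proxy settings were read when
urllib.urlopen("http://example.com").close()  # the opener was built, so this has no effect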

Answer

You want

page = urllib.urlopen(CHECK_PAGE, proxies={})

Remove the

urllib.getproxies = lambda x = None: {}

line.
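
In Python 2's urllib, urlopen() builds a FancyURLopener on the first call and reuses it for every later call, and getproxies() is consulted only when that opener is constructed; that is why monkey-patching getproxies after the first request changes nothing. Passing an explicit proxies dict makes urlopen() build a fresh opener for that one request. A minimal sketch of the corrected check, reusing the original constants (the fetch helper is illustrative, not part of the answer):

import urllib

CHECK_PAGE  = "http://64.37.51.146/check.txt"
CHECK_VALUE = "true\n"
PROXY_VALUE = "Privoxy"

def fetch(proxies=None):
    # proxies=None reuses urllib's cached opener (system proxy settings);
    # proxies={} forces a fresh, proxy-less opener for this request.
    page = urllib.urlopen(CHECK_PAGE, proxies=proxies)
    try:
        return page.read()
    finally:
        page.close()

response = fetch()                  # first attempt, through the configured proxy
if response.find(PROXY_VALUE) != -1:
    response = fetch(proxies={})    # the proxy answered; retry with it bypassed

if response != CHECK_VALUE:
    print "'" + response + "' != '" + CHECK_VALUE + "'"
else:
    print "You are online!"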
