Fetch a Wikipedia article with Python

Views: 179. Published: 2020/11/25 19:24:51. Tags: python urllib2 user-agent wikipedia http-status-code-403


Question

I am trying to fetch a Wikipedia article with Python's urllib:

import urllib  # Python 2 standard library

f = urllib.urlopen("http://en.wikipedia.org/w/index.php?title=Albert_Einstein&printable=yes")
s = f.read()
f.close()

However, instead of the HTML page I get the following response, titled "Error - Wikimedia Foundation":

Request: GET http://en.wikipedia.org/w/index.php?title=Albert_Einstein&printable=yes, from 192.35.17.11 via knsq1.knams.wikimedia.org (squid/2.6.STABLE21) to ()
Error: ERR_ACCESS_DENIED, errno [No Error] at Tue, 23 Sep 2008 09:09:08 GMT

Wikipedia seems to block requests that do not come from a standard browser.

Does anybody know how to work around this?

Recommended answer

You need to use urllib2, which supersedes urllib in the Python standard library, in order to change the user agent.

Straight from the examples:

import urllib2
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
infile = opener.open('http://en.wikipedia.org/w/index.php?title=Albert_Einstein&printable=yes')
page = infile.read()
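
The snippet above is Python 2. In Python 3, urllib2 was merged into urllib.request, so the same idea (sending an explicit User-Agent header) might look roughly like the sketch below. The names USER_AGENT, build_request, and fetch_article are hypothetical helpers introduced here for illustration, the contact address is a placeholder, and the https:// scheme is an assumption (the question used http://).

```python
# Python 3 sketch of the same technique: set a custom User-Agent header.
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical identifier (assumption): replace with a string that
# describes your own script and how to contact you.
USER_AGENT = "ExampleArticleFetcher/0.1 (contact: you@example.com)"


def build_request(title):
    """Build a GET request for the printable view of a Wikipedia article."""
    query = urlencode({"title": title, "printable": "yes"})
    url = "https://en.wikipedia.org/w/index.php?" + query
    # The header is attached to the request itself, so no opener is needed.
    return Request(url, headers={"User-Agent": USER_AGENT})


def fetch_article(title, timeout=10):
    """Return the article HTML as text. This performs a network request."""
    with urlopen(build_request(title), timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Usage (needs network access):
#     page = fetch_article("Albert_Einstein")
```

Building the request separately from sending it keeps the header logic easy to inspect without touching the network. A descriptive User-Agent is used here instead of a browser string such as 'Mozilla/5.0'; either one replaces the default Python-urllib identifier that the request would otherwise carry.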
