Python Mechanize won't open these sites

Views: 71 · Posted: 2020/5/8 1:02:17 · Tags: python, mechanize

Question

I'm working with Python's Mechanize module. I've come across 3 different sites that mechanize cannot open directly:

en.wikipedia.org/wiki/Dog (as a new user, I can't post more than 2 links)
http://www.cpsc.gov/cpscpub/prerel/prhtml03/03059.html

import mechanize
br = mechanize.Browser()
br.set_handle_robots(False)

Adding the following code allows mechanize to open and parse the Wikipedia article and the Google search results:

    br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

However, these workarounds are no match for the CPSC.gov website. When I try to open it with the mechanize Browser, my Python process freezes, to the point where I can't even interrupt it with a KeyboardInterrupt.

What is going on here?

Recommended answer

In the case of the cpsc.gov site, it looks like there's a refresh header that isn't being correctly processed by mechanize's HTTPRefreshProcessor. However, you can work around the problem as follows:

import mechanize

url = 'http://www.cpsc.gov/cpscpub/prerel/prhtml03/03059.html'
br = mechanize.Browser()
# Turn off refresh handling so the page's refresh header is ignored
br.set_handle_refresh(False)
br.open(url)
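
To make the refresh header mentioned above more concrete, here is a minimal standard-library sketch of how a client might parse a Refresh value and decide whether to follow it. The header values, the `parse_refresh` and `should_follow` helpers, and the 5-second cutoff are all hypothetical illustrations; the source does not show the header the CPSC server actually sent, and this is not mechanize's own implementation.

```python
def parse_refresh(value):
    """Split a Refresh header value into (delay_seconds, url_or_None)."""
    delay_part, _, rest = value.partition(";")
    delay = float(delay_part.strip())
    rest = rest.strip()
    url = None
    if rest:
        key, sep, target = rest.partition("=")
        if sep and key.strip().lower() == "url":
            # Usual form: "5; url=http://host/path"
            url = target.strip().strip("'\"")
        else:
            # Assumption: treat a bare target without "url=" as the URL
            url = rest
    return delay, url


def should_follow(value, max_delay=5.0):
    """Follow a refresh only if its delay is short enough to wait for."""
    delay, _ = parse_refresh(value)
    return delay <= max_delay


# Hypothetical header values, for illustration only
print(parse_refresh("0; url=http://example.com/new"))
print(should_follow("3600; url=http://example.com/"))
```

A client that waits out every refresh delay can appear to hang when the delay is long or the refresh repeats, which is one reason capping the delay, or disabling refresh handling as the answer does, avoids the stall.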
