urllib2 returns a different page than the browser does?

Views: 61 | Published: 2021/7/17 18:44:07 | Tags: python, screen-scraping, urllib2

Problem description

I'm trying to scrape a page (my router's admin page), but the device seems to serve a different page to urllib2 than to my browser. Has anyone run into this before? How can I get around it?

This is the code I'm using:

>>> from BeautifulSoup import BeautifulSoup
>>> import urllib2
>>> page = urllib2.urlopen("http://192.168.1.254/index.cgi?active_page=9133&active_page_str=page_bt_home&req_mode=0&mimic_button_field=btn_tab_goto:+9133..&request_id=36590071&button_value=9133")
>>> soup = BeautifulSoup(page)
>>> soup.prettify()

(The HTML output was stripped by Markdown.)

Solution

Use Firebug to see which headers and cookies your browser sends to the server. Then emulate the same request with urllib2.Request and cookielib.
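As a rough illustration of that approach, here is a minimal sketch of an opener that sends browser-like headers and keeps cookies between requests. The header values, the Referer and the helper names make_browser_like_opener and fetch are assumptions made up for this example; copy the real headers from what your browser sends. The try/except import lets the same code run on Python 3, where urllib2 and cookielib became urllib.request and http.cookiejar.

```python
try:  # Python 2, as used in the question
    import urllib2 as request
    import cookielib as cookiejar
except ImportError:  # Python 3 renamed both modules
    import urllib.request as request
    import http.cookiejar as cookiejar

# Hypothetical values: replace them with the headers your browser actually sends.
BROWSER_HEADERS = [
    ("User-Agent", "Mozilla/5.0 (X11; Linux x86_64) Gecko/20100101 Firefox/115.0"),
    ("Accept", "text/html,application/xhtml+xml"),
    ("Referer", "http://192.168.1.254/"),
]


def make_browser_like_opener(headers=BROWSER_HEADERS):
    """Return (opener, jar): an opener that sends `headers` and stores cookies."""
    jar = cookiejar.CookieJar()
    opener = request.build_opener(request.HTTPCookieProcessor(jar))
    # Overwrites the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = list(headers)
    return opener, jar


def fetch(opener, url):
    """Fetch `url` through the opener and return the raw response body."""
    response = opener.open(url)
    try:
        return response.read()
    finally:
        response.close()
```

If the router sets a session cookie on a login or landing page, fetching that page first with the same opener should leave the cookie in the jar, so that a later fetch(opener, url) for the index.cgi page sends it back automatically.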

EDIT: You can also use mechanize.
