Mechanize Python and addheader method - how do I know the newest headers?


Question

Currently, I'm using mechanize like this:

        import mechanize

        browser = mechanize.Browser()
        browser.set_handle_robots(False)  # do not fetch or obey robots.txt
        browser.set_handle_equiv(False)   # do not treat <meta http-equiv> tags as HTTP headers
        browser.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]

However, operating systems and browsers get updated, so I assume this User-agent header should be updated as well: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1

Is there a template or method for building such a header string? Where can I find the newest available values to use in it?

Recommended answer

Why do you always need the newest user agent in your fake header? In most cases, sites will not block you for using an older browser, so it is sufficient to update the string from time to time, or not at all. Often it is enough to put "Mozilla" at the front of the UA string to get the same response a browser would get.
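As a rough sketch of that idea: the User-agent value is just a string inside a list of (name, value) tuples, so it can live in one place and be changed occasionally. The helper name `build_addheaders` and the constant `QUESTION_UA` are hypothetical names used for illustration and are not part of mechanize; the default string is simply the one from the question.

```python
# The UA string from the question, kept in one place so it is easy to update.
QUESTION_UA = (
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) "
    "Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1"
)


def build_addheaders(user_agent=QUESTION_UA):
    """Hypothetical helper: return headers as a list of (name, value) tuples,
    the same shape the question assigns to browser.addheaders."""
    if not user_agent.startswith("Mozilla"):
        # Following the answer's suggestion: a leading "Mozilla" token is
        # often enough for a site to respond as it would to a browser.
        user_agent = "Mozilla/5.0 " + user_agent
    return [("User-agent", user_agent)]
```

With a mechanize browser object this could be used as `browser.addheaders = build_addheaders()`. Whether a particular site accepts a given string is something only a test against that site can confirm.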

Alternatively, if you run a web server, take a random (non-bot) user-agent string from your HTTP logs.
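A minimal sketch of the log-based approach, assuming the server writes the common "combined" access log format, where the user agent is the last double-quoted field on each line. The function names, the regular expression, and the `BOT_HINTS` keyword list are illustrative assumptions; the keyword filter is a crude heuristic and will not catch every bot.

```python
import random
import re

# Assumption: combined log format, where each line ends with
# "referer" "user-agent". Capture the last quoted field.
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')

# Assumption: simple substrings that usually indicate non-browser clients.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "wget", "python")


def user_agents_from_log(lines):
    """Return the sorted, de-duplicated user-agent strings that do not look like bots."""
    agents = set()
    for line in lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line is not in the expected format
        ua = match.group(1)
        if ua and ua != "-" and not any(hint in ua.lower() for hint in BOT_HINTS):
            agents.add(ua)
    return sorted(agents)


def pick_user_agent(lines):
    """Pick one non-bot user agent at random, or None if none were found."""
    agents = user_agents_from_log(lines)
    return random.choice(agents) if agents else None
```

Given an open log file, `pick_user_agent(log_file)` would return one candidate string; the path to the log depends on the server configuration and is left to the reader.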
