Changing the link in python mechanize


Question

I am trying to write a Python script that generates the rank list for my batch. Doing this by hand, I just change the roll-number parameter of a link using the browser's inspect-element feature. The (relative) link looks something like this:

/academic/utility/AcademicRecord.jsp?loginCode=000&loginnumber=000&loginName=name&Home=ascwebsite

I only need to change loginCode to get the grades of my batch-mates. I want to use Python to iterate through all the roll numbers and generate a rank list. I am using the mechanize library to open the site. The relevant portion of the code:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)
response = br.open('link_to_the_page')

I then do the necessary authentication and navigate to the page that contains the link for viewing the grades. There I find the relevant link like this:

for link in br.links(url_regex='/academic/utility/AcademicRecord.jsp?'):

Inside this loop I change the URL and attributes of the link as needed, and then open the link using:

response=br.follow_link(link)
print response.read()

But it does not work. It opens the same link, i.e. the one with the initial roll number. In fact, I even tried changing the URL of the link to something completely different, like http://www.google.com:

link.url='http://www.google.com'
link.base_url='http://www.google.com'

It still opens the same page, not Google's page. Any help would be highly appreciated.

Recommended answer

According to the source code, follow_link() and click_link() use the link's absolute_url attribute, which is set when the link is initialized. You are setting only the url and base_url attributes, so absolute_url still points to the original address.
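The effect can be reproduced without mechanize. The sketch below is a simplified stand-in, not mechanize's actual code: `SketchLink` is a hypothetical class that, like the behaviour described above, computes `absolute_url` once in its constructor. The example.com URL is made up, and the code assumes Python 3 with only the standard library.

```python
from urllib.parse import urljoin


class SketchLink:
    """Hypothetical stand-in for a link object (not mechanize's real class)."""

    def __init__(self, base_url, url):
        self.base_url = base_url
        self.url = url
        # Computed once here; later changes to url/base_url do not update it.
        self.absolute_url = urljoin(base_url, url)


link = SketchLink("http://example.com/academic/", "record.jsp?loginCode=000")

# Changing only url and base_url leaves absolute_url untouched.
link.url = "http://www.google.com"
link.base_url = "http://www.google.com"
stale = link.absolute_url

# Assigning absolute_url directly is what changes the target.
link.absolute_url = "http://www.google.com"
```

Here `stale` still holds the original joined address, which is why opening the link keeps landing on the same page until `absolute_url` itself is reassigned.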

The solution is to change the absolute_url of the link inside the loop:

BASE_URL = 'link_to_the_page'
for link in br.links(url_regex='/academic/utility/AcademicRecord.jsp?'):
    modified_link = ...
    link.absolute_url = mechanize.urljoin(BASE_URL, modified_link)
    br.follow_link(link)
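The answer leaves the construction of the modified link open. One possible way to build it, shown only as a sketch, is to rewrite the loginCode query parameter with the standard library. The helper name `with_login_code` and the sample roll number are assumptions made for illustration, not part of the original answer or of mechanize, and the code assumes Python 3.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def with_login_code(url, login_code):
    """Return url with its loginCode query parameter set to login_code.

    Hypothetical helper: other parameters keep their order and values.
    """
    parts = urlsplit(url)
    # keep_blank_values=True so parameters with empty values are not dropped.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs = [(k, login_code if k == "loginCode" else v) for k, v in pairs]
    return parts._replace(query=urlencode(pairs)).geturl()


original = ("/academic/utility/AcademicRecord.jsp"
            "?loginCode=000&loginnumber=000&loginName=name&Home=ascwebsite")
modified = with_login_code(original, "123")  # "123" is a made-up roll number
```

The resulting relative URL could then be joined with the base URL and assigned to `link.absolute_url` as in the loop above.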

Hope that helps.
