Changing the link in python mechanize
Question
I am trying to write a Python script that generates the rank list of my batch. Doing this by hand only requires changing the roll-number parameter of a link with the browser's inspect-element feature. The (relative) link looks something like:
/academic/utility/AcademicRecord.jsp?loginCode=000&loginnumber=000&loginName=name&Home=ascwebsite
I just need to change loginCode to get the grades of my batch-mates. I am trying to use Python to iterate through all the roll numbers and generate a rank list. I used the mechanize library to open the site from Python. The relevant portion of the code:
import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)
response = br.open('link_to_the_page')
I then do the necessary authentication and navigate to the page where the link to view the grades resides.
Then I find the relevant link like this:
for link in br.links(url_regex='/academic/utility/AcademicRecord.jsp?'):
Inside this loop I change the URL and attributes of the link appropriately, and then open the link using:
response=br.follow_link(link)
print response.read()
But it does not work. It opens the same link, i.e. the one with the initial roll number. In fact, I even tried changing the link's URL to something completely different, like http://www.google.com.
link.url='http://www.google.com'
link.base_url='http://www.google.com'
It still opens the same page, not Google's page. Any help would be highly appreciated.
Recommended answer
According to the source code, follow_link() and click_link() use the link's absolute_url property, which is set during link initialization. You are setting only the url and base_url properties.
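To illustrate why changing url and base_url has no effect, here is a minimal sketch using a simplified stand-in class. FakeLink and the example.com address are hypothetical and are not mechanize's actual implementation; the sketch only assumes, as described above, that absolute_url is computed once when the link object is created.

```python
from urllib.parse import urljoin


class FakeLink:
    """Simplified, hypothetical stand-in for a mechanize link object."""

    def __init__(self, base_url, url):
        self.base_url = base_url
        self.url = url
        # Computed once here; later changes to url/base_url do not update it.
        self.absolute_url = urljoin(base_url, url)


link = FakeLink("http://example.com/academic/",
                "utility/AcademicRecord.jsp?loginCode=000")

# Mirror what the question does: overwrite only url and base_url.
link.url = "http://www.google.com"
link.base_url = "http://www.google.com"

# absolute_url still points at the original page.
print(link.absolute_url)
```

Any code that reads absolute_url, as follow_link() is described as doing, would therefore still see the original address.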
The solution would be to change the absolute_url of the link in the loop:
BASE_URL = 'link_to_the_page'
for link in br.links(url_regex='/academic/utility/AcademicRecord.jsp?'):
    modified_link = ...
    link.absolute_url = mechanize.urljoin(BASE_URL, modified_link)
    br.follow_link(link)
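The answer leaves the construction of modified_link open. One possible way to build it, sketched here with only the standard library, is to rewrite the loginCode query parameter of the relative link from the question. The helper name with_login_code and the roll numbers "001" and "002" are assumptions made for illustration; the real roll-number format is not given in the question.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_login_code(url, login_code):
    """Return url with its loginCode query parameter replaced (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep every parameter in order, swapping only the loginCode value.
    query = [(key, login_code if key == "loginCode" else value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)]
    return urlunsplit(parts._replace(query=urlencode(query)))


# Relative link taken from the question.
original = ("/academic/utility/AcademicRecord.jsp"
            "?loginCode=000&loginnumber=000&loginName=name&Home=ascwebsite")

# Hypothetical roll numbers, for illustration only.
for roll_number in ["001", "002"]:
    print(with_login_code(original, roll_number))
```

Each result could then serve as modified_link before it is joined with BASE_URL and assigned to link.absolute_url.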
Hope that helps.