Python - Requests/RoboBrowser - ASPX POST JavaScript
Question
I am porting a working bash script that uses curl to POST the payloads in the code to the URLs. The basic issue is that, with RoboBrowser, I'm running into trouble posting with the page forms.
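Stepping through the site:
- Log in at /SubLogin.aspx
- A successful login redirects to /OptionsSummary.aspx
- GET /FindMe.aspx with parameters
- POST to /FindMe.aspx via the "Phone Lists" button (the page should then load the Phone Lists table with one item, "Work")
- Selecting the "Work" item POSTs to /PhoneLists.aspx (this should then load the "Work" table with the list of users)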
I have been able to authenticate to the site and perform GETs with both RoboBrowser and Requests+bs4; however, I'm confused about POSTing back to the pages themselves.
Using RoboBrowser (liboncall.py):
#!/usr/bin/python
from robobrowser import RoboBrowser
from bs4 import BeautifulSoup as BS
oc_mailbox = '123456'
oc_password_hashed = 'ABCDEFG'
oc_base_uri = 'http://example.com'
auth_uri = oc_base_uri + '/SubLogin.aspx'
find_uri = oc_base_uri + '/FindMe.aspx'
phne_uri = oc_base_uri + '/PhoneLists.aspx'
# Login form payload, as sent by the original curl script.
p_auth_payload = {
    'SubLoginControl:javascriptTest': 'true',
    'SubLoginControl:mailbox': oc_mailbox,
    'SubLoginControl:phoneNumber': '',
    'SubLoginControl:password': oc_password_hashed,
    'SubLoginControl:btnLogOn': 'Logon',
    'SubLoginControl:webLanguage': 'en-US',
    'SubLoginControl:initialLanguage': 'en-US',
    'SubLoginControl:errorCallBackNumber': 'Entered telephone number contains non-dialable characters.',
    'SubLoginControl:cookieMailbox': 'mailbox',
    'SubLoginControl:cookieCallbackNumber': 'callbackNumber',
    'SubLoginControl:serverDomain': ''
}
p_find_payload = {
    'FindMeControl:enableFindMe': 'on',
    'FindMeControl:MasterDataControl:focusElement': '',
    'FindMeControl:MasterDataControl:masterList:_ctl0:enabled': 'on',
    'FindMeControl:MasterDataControl:masterList:_ctl0:itemGuid': '',
    'FindMeControl:MasterDataControl:hidSelectedScheduleName': '',
    'FindMeControl:MasterDataControl:hidbtnStatus': '',
    'FindMeControl:MasterDataControl:hidScheduleXML': '',
    'FindMeControl:MasterDataControl:tempScheduleXML': '',
    'FindMeControl:MasterDataControl:hidSelectedScheduleGUID': '',
    'FindMeControl:MasterDataControl:hidChangedScheduleList': '',
    'FindMeControl:btnPhoneLists': 'Phone Lists',
    'FindMeControl:enableFindMeHidden': '',
    'FindMeControl:applySet': 'false'
}
p_phne_payload = {
    '__EVENTARGUMENT': '',
    '__EVENTTARGET': 'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton',
    'PhoneListsControl:MasterDataControl:focusElement': '',
    'PhoneListsControl:MasterDataControl:masterList:_ctl0:itemGuid': '',
    'PhoneListsControl:MasterDataControl:hidSelectedScheduleName': '',
    'PhoneListsControl:MasterDataControl:hidbtnStatus': '',
    'PhoneListsControl:MasterDataControl:hidScheduleXML': '',
    'PhoneListsControl:MasterDataControl:tempScheduleXML': '',
    'PhoneListsControl:MasterDataControl:hidSelectedScheduleGUID': '',
    'PhoneListsControl:MasterDataControl:hidChangedScheduleList': '',
    'PhoneListsControl:applySet': 'false'
}
def auth(mailbox, password):
    # Open the login page, fill in the ASP.NET login form, and submit it.
    browser = RoboBrowser(history=False)
    browser.open(auth_uri)
    signin = browser.get_form(id='aspnetForm')
    signin['SubLoginControl:mailbox'].value = mailbox
    signin['SubLoginControl:password'].value = password
    signin['SubLoginControl:javascriptTest'].value = 'true'
    signin['SubLoginControl:btnLogOn'].value = 'Logon'
    signin['SubLoginControl:webLanguage'].value = 'en-US'
    signin['SubLoginControl:initialLanguage'].value = 'en-US'
    signin['SubLoginControl:errorCallBackNumber'].value = 'Entered+telephone+number+contains+non-dialable+characters.'
    signin['SubLoginControl:cookieMailbox'].value = 'mailbox'
    signin['SubLoginControl:cookieCallbackNumber'].value = 'callbackNumber'
    signin['SubLoginControl:serverDomain'].value = ''
    browser.submit_form(signin)
    # Return the logged-in browser so later requests reuse its session.
    return browser
Log in to the site and show the URL to verify we're in:
In [20]: from liboncall import *
In [21]: m = auth(oc_mailbox, oc_password_hashed)
In [22]: m.url
Out[22]: u'http://example.com/OptionsSummary.aspx'
Open "/FindMe.aspx":
In [24]: m.open(find_uri)
In [25]: m.url
Out[25]: u'http://example.com/FindMe.aspx'
Initially, /FindMe.aspx loads a form and a "Phone Lists" button (FindMeControl:btnPhoneLists).
In [26]: m.select('title')
Out[26]: [<title>Find Me</title>]
In [27]: form_find_a = m.get_form(action="FindMe.aspx")
In [28]: for i in form_find_a.keys():
   ....:     print(i)
   ....:
__VIEWSTATE
__EVENTVALIDATION
FindMeControl:enableFindMe
FindMeControl:MasterDataControl:focusElement
FindMeControl:MasterDataControl:masterList:_ctl0:enabled
FindMeControl:MasterDataControl:masterList:_ctl0:itemGuid
FindMeControl:MasterDataControl:btnAdd
FindMeControl:MasterDataControl:btnDelete
FindMeControl:MasterDataControl:btnRename
FindMeControl:MasterDataControl:btnCancel
FindMeControl:MasterDataControl:btnEnter
FindMeControl:MasterDataControl:btnUpdate
FindMeControl:MasterDataControl:hidSelectedScheduleName
FindMeControl:MasterDataControl:hidbtnStatus
FindMeControl:MasterDataControl:hidScheduleXML
FindMeControl:MasterDataControl:tempScheduleXML
FindMeControl:MasterDataControl:hidSelectedScheduleGUID
FindMeControl:MasterDataControl:hidChangedScheduleList
FindMeControl:btnApply
FindMeControl:btnSchedules
FindMeControl:btnPhoneLists
FindMeControl:enableFindMeHidden
FindMeControl:applySet
Remove the unneeded form fields, fill out the form, and submit:
In [29]: find_remove = (
'FindMeControl:MasterDataControl:btnAdd',
'FindMeControl:MasterDataControl:btnDelete',
'FindMeControl:MasterDataControl:btnRename',
'FindMeControl:MasterDataControl:btnCancel',
'FindMeControl:MasterDataControl:btnEnter',
'FindMeControl:MasterDataControl:btnUpdate',
'FindMeControl:btnApply',
'FindMeControl:btnSchedules')
In [30]: for i in find_remove:
   ....:     form_find_a.fields.pop(i)
In [31]: form_find_a['FindMeControl:enableFindMe'].value = 'on'
form_find_a['FindMeControl:MasterDataControl:focusElement'].value = ''
form_find_a['FindMeControl:MasterDataControl:masterList:_ctl0:enabled'].value = 'on'
form_find_a['FindMeControl:MasterDataControl:masterList:_ctl0:itemGuid'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidSelectedScheduleName'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidbtnStatus'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidScheduleXML'].value = ''
form_find_a['FindMeControl:MasterDataControl:tempScheduleXML'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidSelectedScheduleGUID'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidChangedScheduleList'].value = ''
form_find_a['FindMeControl:btnPhoneLists'].value = 'Phone Lists'
form_find_a['FindMeControl:enableFindMeHidden'].value = ''
form_find_a['FindMeControl:applySet'].value = 'false'
Out [31]: ...
In [32]: m.submit_form(form_find_a)
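Popping the other buttons matters because a browser includes only the submit button that was actually clicked in the POST body, and an ASP.NET page uses that button's name to decide which click handler runs. Below is a minimal sketch of that filtering on a plain dict, independent of RoboBrowser. The helper name `keep_clicked_button`, the sample field subset, and the button values `'Apply'` and `'Schedules'` are made up for illustration.

```python
def keep_clicked_button(fields, buttons, clicked):
    """Return a copy of `fields` without any submit button except `clicked`.

    `fields` maps form field names to values, `buttons` is the set of names
    that are submit buttons on the page, and `clicked` is the one to keep.
    """
    if clicked not in buttons:
        raise ValueError("%r is not one of the known buttons" % clicked)
    return {
        name: value
        for name, value in fields.items()
        if name not in buttons or name == clicked
    }


# Hypothetical subset of the FindMe.aspx fields listed above.
sample_fields = {
    '__VIEWSTATE': 'opaque-state',
    'FindMeControl:enableFindMe': 'on',
    'FindMeControl:btnApply': 'Apply',
    'FindMeControl:btnSchedules': 'Schedules',
    'FindMeControl:btnPhoneLists': 'Phone Lists',
}
sample_buttons = {
    'FindMeControl:btnApply',
    'FindMeControl:btnSchedules',
    'FindMeControl:btnPhoneLists',
}

payload = keep_clicked_button(
    sample_fields, sample_buttons, 'FindMeControl:btnPhoneLists'
)
print(sorted(payload))
```

Non-button fields such as `__VIEWSTATE` pass through untouched, which is what the manual `fields.pop()` loop above achieves as well.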
Verify that the page has updated and has the list item "Work":
In [33]: m.parsed.find('title')
Out[33]: <title>Phone Lists</title>
In [34]: m.parsed.find('a', id='PhoneListsControl_MasterDataControl_masterList__ctl0_SelectButton')
Out[34]: <a class="linkButtonItem" href="javascript:__doPostBack('PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton','')" id="PhoneListsControl_MasterDataControl_masterList__ctl0_SelectButton" onclick="javascript:onClick();">Work</a>
Get the "PhoneLists.aspx" form, remove the unneeded fields, fill it out, and submit:
In [35]: form_find_b = m.get_form(action='PhoneLists.aspx')
In [36]: phne_remove = (
'PhoneListsControl:MasterDataControl:btnAdd',
'PhoneListsControl:MasterDataControl:btnDelete',
'PhoneListsControl:MasterDataControl:btnRename',
'PhoneListsControl:MasterDataControl:btnCancel',
'PhoneListsControl:MasterDataControl:btnEnter',
'PhoneListsControl:MasterDataControl:btnUpdate',
'PhoneListsControl:btnApply',
'PhoneListsControl:btnBack')
In [37]: for i in phne_remove:
   ....:     form_find_b.fields.pop(i)
In [38]: form_find_b['PhoneListsControl:MasterDataControl:focusElement'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidChangedScheduleList'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidScheduleXML'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidSelectedScheduleGUID'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidSelectedScheduleName'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidbtnStatus'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:masterList:_ctl0:itemGuid'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:tempScheduleXML'].value = ''
form_find_b['PhoneListsControl:applySet'].value = 'false'
In [39]: m.submit_form(form_find_b)
Review the response to see whether the user list loaded. In this instance, it did not:
In [40]: m.parsed.findAll('div', id='PhoneListsControl_phoneListMembersText')
Out[40]: [<div class="displayText" id="PhoneListsControl_phoneListMembersText"></div>]
If it had been successful, the above would return:
<div id="PhoneListsControl_phoneListMembersText" class="displayText" style="top: 315px; left: 281px;"> Work </div>
Along with the following items in a table (PhoneListsControl_phoneListDetail):
<input name="PhoneListsControl:phoneListDetail:_ctl2:number" type="text" value="95551234567" maxlength="50" id="PhoneListsControl_phoneListDetail__ctl2_number" onkeyup="enableApplyButton('PhoneListsControl_')" style="width:140px;">
...
<input name="PhoneListsControl:phoneListDetail:_ctl3:number" type="text" value="95551236789" maxlength="50" id="PhoneListsControl_phoneListDetail__ctl2_number" onkeyup="enableApplyButton('PhoneListsControl_')" style="width:140px;">
...
At this juncture I figured out that RoboBrowser isn't including all the form data required for the POST to "PhoneLists.aspx" to work as expected ('__EVENTTARGET': 'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton' and __EVENTARGUMENT). Setting the params and then calling submit_form(form_find_b) does not achieve the desired result either. I wonder whether add_field() from robobrowser.forms.form would work, but I don't understand how to use it properly (if it can be used this way at all), e.g. to add the __EVENTTARGET and __EVENTARGUMENT hidden input fields to the form.
Is there something else I am missing, or do RoboBrowser/Requests not support this type of POST? Is it that the form requires JavaScript to execute, as mentioned here with mechanize?
Recommended answer
Solved
After much googling, re-posting for help on reddit, and then randomly stumbling on this RoboBrowser issue that showed me how to properly use the add_field() method, the problem is solved.
For example:
import robobrowser.forms.fields
# Build the two hidden inputs that the page's __doPostBack() script would normally set.
b_e_arg = robobrowser.forms.fields.Input('<input name="__EVENTARGUMENT" value="" />')
b_e_target = robobrowser.forms.fields.Input('<input name="__EVENTTARGET" value="PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton" />')
In [30]: form_find_b.add_field(b_e_target)
In [31]: form_find_b.add_field(b_e_arg)
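The two values are the arguments of the `__doPostBack(...)` call in the "Work" link's `href` shown in the question: in ASP.NET WebForms that script copies them into the `__EVENTTARGET` and `__EVENTARGUMENT` hidden fields and then submits the form. Rather than hard-coding them, they could be pulled out of the `href`. The following is a sketch using only the standard library; `parse_do_postback` is a hypothetical helper name, not part of RoboBrowser.

```python
import re

# Matches __doPostBack('target','argument') with either quote style.
_POSTBACK_RE = re.compile(
    r"""__doPostBack\(\s*(['"])(?P<target>.*?)\1\s*,\s*(['"])(?P<argument>.*?)\3\s*\)"""
)


def parse_do_postback(href):
    """Return (event_target, event_argument) from a __doPostBack href, or None."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        return None
    return match.group('target'), match.group('argument')


# The href of the "Work" link from the question.
href = ("javascript:__doPostBack("
        "'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton','')")
result = parse_do_postback(href)
print(result)
```

The returned pair could then be used as the `value` attributes of the two `Input` fields in place of the literal strings.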
Once the form was updated with these values, submitting the form to "PhoneLists.aspx" works as expected.
In [33]: m.submit_form(form_find_b)
In [34]: m.url
Out[34]: u'http://example.com/PhoneLists.aspx'
In [35]: m.parsed.findAll('div', id='PhoneListsControl_phoneListMembersText')
Out[35]: [<div class="displayText" id="PhoneListsControl_phoneListMembersText"> Work </div>]
In [36]: m.parsed.findAll('input', id='PhoneListsControl_phoneListDetail__ctl2_number')
Out[36]: [<input id="PhoneListsControl_phoneListDetail__ctl2_number" maxlength="50" name="PhoneListsControl:phoneListDetail:_ctl2:number" onkeyup="enableApplyButton('PhoneListsControl_')" type="text" value="95551234567"/>]
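The same idea should carry over to plain Requests, which the question also asks about: the POST body needs the page's hidden state fields (`__VIEWSTATE`, `__EVENTVALIDATION`) plus the two event fields. Here is a rough sketch that uses the standard library's `html.parser` to scrape the hidden inputs. `HiddenInputCollector`, `build_postback_payload`, and the sample HTML are illustrative and not taken from the site; I have not run this against the real pages.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements.

    Simplification: this looks at the whole page, not at one specific form.
    """

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get('type') or '').lower() == 'hidden'
        if is_hidden and attributes.get('name'):
            self.fields[attributes['name']] = attributes.get('value') or ''


def build_postback_payload(page_html, event_target, event_argument=''):
    """Merge a page's hidden inputs with the __doPostBack event fields."""
    collector = HiddenInputCollector()
    collector.feed(page_html)
    payload = dict(collector.fields)
    payload['__EVENTTARGET'] = event_target
    payload['__EVENTARGUMENT'] = event_argument
    return payload


# Made-up page fragment standing in for the PhoneLists.aspx response.
sample_html = """
<form action="PhoneLists.aspx" method="post">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789" />
  <input type="submit" name="PhoneListsControl:btnApply" value="Apply" />
</form>
"""

payload = build_postback_payload(
    sample_html,
    'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton',
)
print(payload)
```

With Requests, a dict like this would be passed as `data=` to `session.post(...)` on the same `Session` that holds the login cookies, together with any visible fields the page expects.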
I hope anyone else who has to scrape ASPX sites finds this useful. Happy hacking to all!