Failed to screen scrape ASP.NET website while posting data

Question

I am getting an "Invalid postback or callback argument" error while trying to screen scrape a website that is built on ASP.NET.

The first request, for the landing page, works without issue. The exception is raised when I post the form data after changing the value of one of the drop-down fields.

"""
Invalid postback or callback argument.  Event validation is enabled using
<pages enableEventValidation="true"/> in configuration or <%@ Page
EnableEventValidation="true" %> in a page.  For security purposes, this feature
verifies that arguments to postback or callback events originate from the server
control that originally rendered them.  If the data is valid and expected, use
the ClientScriptManager.RegisterForEventValidation method in order to register
the postback or callback data for validation. 
"""

Here is my attempt:

#!/usr/bin/env python
import sys
import requests
from bs4 import BeautifulSoup

HOST = 'forms.toyotabharat.com'
URL = 'http://%s/pricelist-dealer.aspx' % HOST
HEADERS = {
    'Host': HOST,
    'Origin': 'http://%s' % HOST,
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate',
    'Connection': 'keep-alive'
}

session = requests.Session()

r = session.get(URL, headers=HEADERS)

if r.status_code != requests.codes.ok:
    sys.exit()

soup = BeautifulSoup(r.content, 'html.parser')

# ASP validation and session fields
view_state = soup.select("#__VIEWSTATE")[0]['value']
view_state_generator = soup.select("#__VIEWSTATEGENERATOR")[0]['value']
event_validation = soup.select("#__EVENTVALIDATION")[0]['value']

FORM_FIELDS = {
    '__EVENTTARGET': 'cboState',
    '__EVENTARGUMENT': '',
    '__LASTFOCUS': '',
    '__VIEWSTATE': view_state,
    '__VIEWSTATEGENERATOR': view_state_generator,
    '__EVENTVALIDATION': event_validation,
    'cboState': '3',
    'cboCity': '-1',
    'hdDealerMaps': 'True',
}

# POST form fields
r = session.post(URL, data=FORM_FIELDS, headers=HEADERS, cookies=r.cookies.get_dict())

if r.status_code != requests.codes.ok:
    print("Failed with status_code %d" % r.status_code)
    sys.exit()

soup = BeautifulSoup(r.content, 'html.parser')

Recommended answer

You were basically on the right track. I got it running with a few changes.

Invalid postback or callback argument.

The error message is really helpful. If you read the MSDN page for it, there is a hint.

In short: don't post parameters or values that are not in the form you receive from the GET request.

In your case, this means that if you select a state, its value must be one of the values in the cboState select element (for example, 2 is not a valid value).
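One way to check this before posting is to read the option values out of the HTML returned by the GET request. The following is a minimal sketch, not the code from the answer: the class SelectOptions, the function allowed_values, and the SAMPLE_HTML markup are all hypothetical, and the sample only stands in for the real page. It uses the standard library's html.parser so it runs without extra packages; with BeautifulSoup, something like soup.select('#cboState option') could serve the same purpose.

```python
from html.parser import HTMLParser


class SelectOptions(HTMLParser):
    """Collect the value attribute of every <option> inside one <select>."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.inside = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("name") == self.select_name:
            self.inside = True
        elif tag == "option" and self.inside and "value" in attrs:
            self.values.append(attrs["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside = False


def allowed_values(html, select_name):
    """Return the option values the server rendered for the given select."""
    parser = SelectOptions(select_name)
    parser.feed(html)
    return parser.values


# Hypothetical markup standing in for the page returned by the GET request.
SAMPLE_HTML = """
<select name="cboState" id="cboState">
  <option value="-1">Select State</option>
  <option value="1">State A</option>
  <option value="3">State B</option>
</select>
<select name="cboCity" id="cboCity">
  <option value="-1">Select City</option>
</select>
"""

values = allowed_values(SAMPLE_HTML, "cboState")
print("3" in values)  # is the value used in the question among the options?
print("2" in values)  # a value that this sample page does not offer
```

Only values that appear in this list are safe to send for cboState; anything else is likely to trip event validation.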

But that part is right in your example, so the second point is not to include parameters in your POST request that are not valid. In this example, that means you shouldn't add cboCity when you post with __EVENTTARGET set to cboState.

Long story short, you need to use these form fields:

FORM_FIELDS = {
    '__EVENTTARGET': 'cboState',
    '__EVENTARGUMENT': '',
    '__LASTFOCUS': '',
    '__VIEWSTATE': view_state,
    '__VIEWSTATEGENERATOR': view_state_generator,
    '__EVENTVALIDATION': event_validation,
    'cboState': '3',
    'hdDealerMaps': 'True',
}
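Rather than typing the field list by hand, the payload could be assembled from the hidden inputs of the GET response plus the one control being changed, so that nothing else (such as cboCity) slips in. This is only a sketch under assumptions: HiddenInputs, build_postback, and SAMPLE_FORM are hypothetical names, the token values are placeholders, and it assumes hdDealerMaps is rendered as a hidden input on the real page.

```python
from html.parser import HTMLParser


class HiddenInputs(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def build_postback(html, event_target, value):
    """Build POST data from the hidden fields plus the changed control only."""
    parser = HiddenInputs()
    parser.feed(html)
    fields = dict(parser.fields)
    # These are normally present as hidden inputs; default them if missing.
    fields.setdefault("__EVENTARGUMENT", "")
    fields.setdefault("__LASTFOCUS", "")
    fields["__EVENTTARGET"] = event_target
    fields[event_target] = value
    return fields


# Hypothetical markup standing in for the page returned by the GET request.
SAMPLE_FORM = """
<form method="post" action="pricelist-dealer.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="vs-token" />
  <input type="hidden" name="__VIEWSTATEGENERATOR" id="__VIEWSTATEGENERATOR" value="gen-token" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="ev-token" />
  <input type="hidden" name="hdDealerMaps" value="True" />
  <select name="cboState"><option value="3">State B</option></select>
  <select name="cboCity"><option value="-1">Select City</option></select>
</form>
"""

payload = build_postback(SAMPLE_FORM, "cboState", "3")
for key in sorted(payload):
    print(key, "=", payload[key])
```

The resulting dictionary could then be passed as data= to session.post in the original script. Other select elements are left out on purpose, which matches the field list above.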

Updated version of the script: https://gist.github.com/fliiiix/ea365b96f5ab4ec4d345
