scrapy FormRequest True/False on/'off' Checked Boxes


Question

In a similar post, someone asked how to change a form value from [on] to not on, which simply required setting a 'True' or 'False' value (using mechanize).

How would this be accomplished with scrapy's FormRequest.from_response?

EDIT

For example, using mechanize to get the form information, this is the default that comes with the webpage form. By default, everything on the form is checked:

<CheckboxControl(ac=[*on])>
type=checkbox, name=ac value=['on']
<CheckboxControl(<None>=[*on])>
type=checkbox, name=None value=[]
<TextControl(p=)>
type=text, name=p value=
<CheckboxControl(pr[]=[*0, *1, *2])>
type=checkbox, name=pr[] value=['0', '1', '2']
<CheckboxControl(a[]=[*0, *1, *2, *3, *4])>
type=checkbox, name=a[] value=['0', '1', '2', '3', '4']
<CheckboxControl(pl=[*on])>
type=checkbox, name=pl value=['on']
<CheckboxControl(sp[]=[*1, *2, *3])>
type=checkbox, name=sp[] value=['1', '2', '3']
<SelectControl(pp=[0, 1, *2, 3])>
type=select, name=pp value=['2']

Note 'ac', '<None>' and 'pl'. They have a value of [*on]. The goal is to turn them 'off' (uncheck them).

FormRequest.from_response(response, formnumber=0, formdata={'pr[]': '2', 'sp[]': '3', 'pp': '3', 'a[]': ['3', '4']})

This returns a form with the boxes modified according to the formdata. Keys not mentioned in the formdata are still checked.

Following the example in the above post:

FormRequest.from_response(response, formdata={'live': 'False'})

I have tried the FormRequest with a variety of values: 'False', 'True', '', [''], 'on', 'off' and 'None', but can't seem to get the right response.

Any suggestions?

EDIT:

Have attempted:

FormRequest(url, formdata = {'pl': 'False'}, callback=parse_this)  
FormRequest(url, formdata = {'pl': 'off'}, callback=parse_this)  
FormRequest(url, formdata = {'pl': ''}, callback=parse_this) 
FormRequest(url, formdata = {'pl': 'None'}, callback=parse_this)
FormRequest(url, formdata = {'pl': None}, callback=parse_this) 

FormRequest.from_response(response, formdata = {'pl': 'False'})  
FormRequest.from_response(response, formdata = {'pl': 'off'})  
FormRequest.from_response(response, formdata = {'pl': ''})

By default, the webpage provides a form whose checkboxes are already checked. The goal is to submit the form and 'turn off' some checkboxes that have only two options: 'on'/'off'.

Recommended answer

A checkbox is an input field like any other: it has a value attribute, which is sent to the server. The only difference is that an unchecked checkbox is not sent at all, whereas a checked one is sent along with the other fields. In other words, a server usually decides whether a checkbox is checked simply by testing whether its name appears in the form data.
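To make this concrete, here is a minimal sketch using only the Python standard library (no Scrapy) of how a form body is encoded when one box is unchecked. The helper name encode_form and the sample field values are assumptions made for illustration; the field names 'ac', 'pl' and 'p' are borrowed from the form in the question.

```python
from urllib.parse import urlencode

# Hypothetical form state: checkbox name -> (value, is_checked)
checkboxes = {
    "ac": ("on", True),
    "pl": ("on", False),  # this box has been unchecked
}
text_fields = {"p": "hello"}


def encode_form(text_fields, checkboxes):
    """Encode a form the way a browser does: unchecked boxes are left out."""
    pairs = list(text_fields.items())
    pairs += [(name, value) for name, (value, checked) in checkboxes.items() if checked]
    return urlencode(pairs)


body = encode_form(text_fields, checkboxes)
print(body)  # p=hello&ac=on
```

Note that 'pl' does not appear as 'pl=off' or 'pl=False'; it is simply absent, which is why sending those string values in the attempts above still reads as "checked" on the server.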

You want to "uncheck" the checkbox called 'live'. That means it simply must NOT be sent to the server at all.

I would use a subclass of FormRequest (not tested, but you should get the idea):

from scrapy import FormRequest


class MyFormRequest(FormRequest):
    """FormRequest subclass that drops fields whose value is None from the submitted form data.
    This allows removing fields collected automatically from a form by FormRequest.from_response."""

    def __init__(self, *args, **kwargs):
        formdata = kwargs.get('formdata')
        if formdata:  # filter out input fields with None values
            # formdata may be a dict or an iterable of (name, value) pairs
            items = formdata.items() if isinstance(formdata, dict) else formdata
            kwargs['formdata'] = [(name, value) for name, value in items if value is not None]

        super().__init__(*args, **kwargs)

Then use MyFormRequest.from_response instead of FormRequest.from_response.

Another way to solve your problem is to construct the FormRequest manually, passing it only the form data that is needed, without using FormRequest.from_response.
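A rough sketch of how that form data might be assembled by hand, assuming you already know the form's defaults. The helper build_formdata is hypothetical and not part of Scrapy; the defaults are the single-valued fields that mechanize reported in the question.

```python
def build_formdata(defaults, overrides):
    """Merge form defaults with overrides; a None override removes the field."""
    merged = dict(defaults)
    merged.update(overrides)
    return {name: value for name, value in merged.items() if value is not None}


# Defaults as listed by mechanize for this form (single-valued fields only).
defaults = {"ac": "on", "pl": "on", "p": "", "pp": "2"}

# Uncheck 'ac' and 'pl', and change the select 'pp' to '3'.
formdata = build_formdata(defaults, {"ac": None, "pl": None, "pp": "3"})
print(formdata)  # {'p': '', 'pp': '3'}

# In a spider, the result could then be passed along the lines of:
#   FormRequest(url, formdata=formdata, callback=parse_this)
```

Because the unchecked names never reach the request body, the server sees them as unchecked.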

Here is an example of what happens with unchecked checkboxes:


In the PHP script (checkbox-form.php), we can get the submitted option from the $_POST array. If $_POST['formWheelchair'] is "Yes", then the box was checked. If the checkbox was not checked, $_POST['formWheelchair'] won't be set.
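The same server-side check can be sketched in Python with the standard library. This is an illustrative stand-in for the PHP logic, not code from the quoted tutorial; the function is_checked and the 'formName' field are assumptions.

```python
from urllib.parse import parse_qs


def is_checked(body, name):
    """Return True if the checkbox `name` appears in a urlencoded form body."""
    return name in parse_qs(body, keep_blank_values=True)


print(is_checked("formName=Ann&formWheelchair=Yes", "formWheelchair"))  # True
print(is_checked("formName=Ann", "formWheelchair"))  # False
```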
