scrapy, how to send multiple requests to a form


Question

OK, I have working code here: I send one request to a form and get back all the data I need. Code:

def start_requests(self):
    numbers = "12345"
    submitForm = FormRequest("https://domain.com/url",
                             formdata={'address': numbers, 'submit': 'Search'},
                             callback=self.after_submit)
    return [submitForm]

Now I need to send multiple requests through the same form and collect the data for each request. I need to collect data for x numbers. I stored all the numbers in a file:

   12345
   54644
   32145
   12345

Code:

def start_requests(self):
    with open(r'C:\spiders\usps\zips.csv') as fp:
        for line in fp:
            submitForm = FormRequest("https://domain.com/url",
                                     formdata={'address': line, 'submit': 'Search'},
                                     callback=self.after_submit,
                                     dont_filter=True)
    # Runs once, after the loop has finished, so only the last request is returned.
    return [submitForm]

This code also works, but it collects data only for the last entry in the file. I need to collect data for every row/number in the file. If I use yield instead of return, scrapy stops and gives this error:

if not request.dont_filter and self.df.request_seen(request):
exceptions.AttributeError: 'list' object has no attribute 'dont_filter'


Recommended answer

First of all, you definitely need yield to "fire" multiple requests, one per line of the file:

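from scrapy import FormRequest

def start_requests(self):
    # Raw string so the backslashes in the Windows path are not read as escapes.
    with open(r'C:\spiders\usps\zips.csv') as fp:
        for line in fp:
            # One request per line; strip() drops the trailing newline.
            yield FormRequest("https://domain.com/url",
                              formdata={'address': line.strip(), 'submit': 'Search'},
                              callback=self.after_submit,
                              dont_filter=True)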
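
The difference between returning after the loop and yielding inside it can be seen without Scrapy at all. The sketch below is only an illustration: build_request is a hypothetical stand-in for FormRequest that just records the form data, and the file is simulated with an in-memory string holding the sample numbers from the question.

```python
import io

def build_request(address):
    # Hypothetical stand-in for scrapy's FormRequest: it only records the form data.
    return {"url": "https://domain.com/url",
            "formdata": {"address": address, "submit": "Search"}}

def requests_with_return(fp):
    # Mirrors the question's version: the variable is overwritten on every
    # iteration, so only the request built from the last line survives.
    request = None
    for line in fp:
        request = build_request(line.strip())
    return [request]

def requests_with_yield(fp):
    # Mirrors the answer: each iteration hands one request to the caller.
    for line in fp:
        yield build_request(line.strip())

# Simulated file contents (the sample numbers from the question).
SAMPLE = "12345\n54644\n32145\n12345\n"

returned = requests_with_return(io.StringIO(SAMPLE))
yielded = list(requests_with_yield(io.StringIO(SAMPLE)))

print(len(returned))  # 1
print(len(yielded))   # 4
```

The sample file contains 12345 twice, which is presumably why the original code passes dont_filter=True: without it, Scrapy's duplicate filter would be expected to drop the repeated request.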

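Also, you shouldn't wrap the FormRequest in a list; just yield the request itself. Scrapy treats every yielded object as a request and reads its dont_filter attribute, so yielding a list produces the AttributeError shown in the question.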
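
To see why yielding a list fails, here is a minimal sketch that imitates the check from the traceback. Nothing in it is Scrapy code: FakeRequest and enqueue are hypothetical names, and enqueue only copies the shape of the `if not request.dont_filter and ...` line.

```python
class FakeRequest:
    """Hypothetical stand-in for a scrapy Request; holds only the attribute the check reads."""
    def __init__(self, dont_filter=False):
        self.dont_filter = dont_filter

def enqueue(item, seen):
    # Same shape as the check in the traceback: the attribute is read directly
    # from whatever object the spider yielded.
    if not item.dont_filter and item in seen:
        return False
    seen.add(item)
    return True

def yields_request():
    # Correct: the yielded object is the request itself.
    yield FakeRequest(dont_filter=True)

def yields_list():
    # Incorrect: the yielded object is a list that happens to contain a request.
    yield [FakeRequest(dont_filter=True)]

seen = set()
accepted = [enqueue(item, seen) for item in yields_request()]

try:
    for item in yields_list():
        enqueue(item, seen)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(accepted)       # [True]
print(error_message)  # mentions that 'list' has no attribute 'dont_filter'
```

A list is fine as the return value of start_requests, because Scrapy iterates over it; it is only wrong as a single yielded item.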
