Making a request to an API from within a scrapy function
Problem description
I'm working with Scrapy. I want to rotate proxies on a per-request basis, getting each proxy from an API I have that returns a single proxy. My plan is to make a request to the API, get a proxy, then use it to set the proxy based on:
http://stackoverflow.com/questions/4710483/scrapy-and-proxies
where I would assign it:
request.meta['proxy'] = 'your.proxy.address'
I have the following:
class ContactSpider(Spider):
    name = "contact"

    def parse(self, response):
        for i in range(1, 3, 1):
            PR = Request('http://myproxyapi.com', headers=self.headers)
            newrequest = Request('http://sitetoscrape.com', headers=self.headers)
            newrequest.meta['proxy'] = PR
but I'm not sure how to use the Scrapy Request object to perform the API call. I'm not getting a response to the PR request while debugging. Do I need to do this in a separate function and use a yield statement, or is my approach wrong?
Recommended answer
Do I need to do this in a separate function and use a yield statement or is my approach wrong?
Yes. Scrapy uses a callback model. You would need to:

- Yield the PR object back to the Scrapy engine.
- Parse the response of PR, and in its callback, yield newrequest.
A quick example:
def parse(self, response):
    for i in range(1, 3, 1):
        PR = Request(
            'http://myproxyapi.com',
            headers=self.headers,
            meta={'newrequest': Request('http://sitetoscrape.com', headers=self.headers)},
            callback=self.parse_PR,
        )
        yield PR

def parse_PR(self, response):
    newrequest = response.meta['newrequest']
    proxy_data = get_data_from_response(response)
    newrequest.meta['proxy'] = proxy_data
    yield newrequest
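The helper get_data_from_response is not defined in the answer; it stands for "extract the proxy address from the API's response body." Assuming the proxy API returns JSON shaped like {"proxy": "http://1.2.3.4:8080"} (both the field name and the format are assumptions about the hypothetical API), a minimal sketch might be:

```python
import json

def get_data_from_response(response):
    # Assumed helper: parses the proxy API's body. The {"proxy": ...}
    # JSON shape is a guess; adjust it to whatever your API actually returns.
    data = json.loads(response.text)
    return data['proxy']
```

Scrapy's TextResponse exposes the decoded body as response.text, so the same helper works unchanged inside parse_PR.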