Test that a Scrapy spider still works - find page changes

Views: 109 Published: 2020/9/14 22:39:44 Tags: unit-testing scrapy automated-tests scrapy-spider

Question

How can I test a Scrapy spider against online data?

I know from this post that it is possible to test a spider against offline data.

My goal is to check whether my spider still extracts the right data from a page, or whether the page has changed. I extract the data via XPath, and sometimes the page receives an update and my scraper no longer works. I would love to keep the test as close to my code as possible, e.g. using the spider and the Scrapy setup and just hooking into the parse method.

Recommended answer

Referring to the link you provided, you could try this method for online testing, which I used for a problem similar to yours. Instead of reading the response from a file, use the Requests library to fetch the live web page, then compose a Scrapy response from the response that Requests returns, like below:

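import requests

from scrapy.http import Request, TextResponse


def online_response_from_url(url=None):
    """Fetch a live page with Requests and wrap it in a Scrapy TextResponse."""
    if not url:
        url = 'http://www.example.com'

    # The Scrapy Request is only attached for context; it is never sent.
    request = Request(url=url)

    # Download the current version of the page.
    oresp = requests.get(url)

    # Build the response object that the spider's parse method expects.
    response = TextResponse(url=url, request=request,
                            body=oresp.text, encoding='utf-8')

    return response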
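
To turn such a live response into an actual check for page changes, one option is to run the spider's parse method on it and look for required fields that come back empty. The following is only a minimal sketch under stated assumptions: the helper names find_extraction_problems and check_spider_against_response are hypothetical, and it assumes the spider yields dict-like items (plain dicts or Scrapy Items) and that an empty value means the XPath no longer matches.

```python
from collections.abc import Mapping


def find_extraction_problems(results, required_fields):
    """Describe what looks wrong in the output of a spider's parse method.

    Only dict-like items are checked, so follow-up Request objects that
    parse() may also yield are ignored. (Hypothetical helper.)
    """
    items = [r for r in results if isinstance(r, Mapping)]
    if not items:
        return ["no items extracted"]

    problems = []
    for index, item in enumerate(items):
        for field in required_fields:
            value = item.get(field)
            # None, "" and [] all suggest the XPath no longer matches.
            if value is None or value == "" or value == []:
                problems.append(f"item {index}: field {field!r} is empty")
    return problems


def check_spider_against_response(spider, response, required_fields):
    """Run spider.parse on a prepared response and return a list of problems."""
    results = list(spider.parse(response) or [])
    return find_extraction_problems(results, required_fields)
```

In a test, the response would come from online_response_from_url(url), and the assertion would be that check_spider_against_response(spider, response, fields) returns an empty list, where spider is an instance of your own spider class and fields lists the item keys you expect. A non-empty list then points at the fields whose XPath probably needs updating.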
