How do you open a file stream for reading using Scrapy?

Views: 49 Published: 2021/7/16 22:12:10 Tags: python scrapy scrapy-spider


Problem description

Using Scrapy, I want to take a URL I have extracted, read the binary file it points to into memory, and extract its contents.

Currently, I can find the URL on the page using a selector, e.g.

myFile = response.xpath('//a[contains(@href,".interestingfileextension")]/@href').extract()

How do I then read that file into memory so that I can look for content in it?

Many thanks

Recommended answer

Make a request for the extracted URL and explore the content in the callback:

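import scrapy

# Both functions are methods of your scrapy.Spider subclass.
def parse(self, response):
    # extract_first() returns None when no link matches
    url = response.xpath('//a[contains(@href,".interestingfileextension")]/@href').extract_first()
    if url:
        # urljoin() resolves a relative href against the page URL
        return scrapy.Request(response.urljoin(url), callback=self.parse_file)

def parse_file(self, response):
    # response.body holds the raw bytes of the downloaded file
    print(response.body)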
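
Since the question asks for a file stream, it may help to note that response.body is a plain bytes object already held in memory. If the code that inspects the file expects a file-like object, one option is to wrap those bytes in io.BytesIO from the standard library. The following is only a sketch under that assumption: the helper names body_to_stream and inspect_body, the 4-byte header length, and the marker argument are hypothetical and not part of Scrapy.

```python
import io


def body_to_stream(body: bytes) -> io.BytesIO:
    """Wrap raw bytes (e.g. response.body) in a seekable, file-like object."""
    return io.BytesIO(body)


def inspect_body(body: bytes, marker: bytes):
    """Return the first 4 bytes and the offset of `marker` (-1 if absent).

    Hypothetical helper for illustration; adapt it to the real file format.
    """
    stream = body_to_stream(body)
    header = stream.read(4)   # read as you would from a file opened with 'rb'
    stream.seek(0)            # rewind before reading the whole content
    offset = stream.read().find(marker)
    return header, offset
```

Inside parse_file this could be called as inspect_body(response.body, b"some marker"). Nothing is written to disk; the stream only exists for as long as the callback keeps a reference to it.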
