How to get the number of requests in queue in scrapy?


Problem Description

I am using scrapy to crawl some websites. How to get the number of requests in the queue?

I have looked at the scrapy source code and find scrapy.core.scheduler.Scheduler may lead to my answer. See: https://github.com/scrapy/scrapy/blob/0.24/scrapy/core/scheduler.py

Two questions:

  1. How to access the scheduler in my spider class?
  2. What do self.dqs and self.mqs mean in the scheduler class?

Recommended Answer

This took me a while to figure out, but here's what I used:

self.crawler.engine.slot.scheduler

That is the scheduler instance. You can then call len() on it (which invokes its __len__() method), or if you just need a true/false check for pending requests, do something like this:

self.crawler.engine.slot.scheduler.has_pending_requests()

Beware that there could still be running requests even though the queue is empty. To check how many requests are currently in progress, use:

len(self.crawler.engine.slot.inprogress)
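To illustrate the attribute chain (engine.slot.scheduler, engine.slot.inprogress) without running a live crawl, here is a sketch using stand-in objects. FakeScheduler, FakeSlot, and FakeEngine are hypothetical names invented for this example; they only mirror the layout of Scrapy 0.24's ExecutionEngine, and the pending/in-progress requests are plain strings rather than real Request objects:

```python
class FakeScheduler:
    """Minimal stand-in for scrapy.core.scheduler.Scheduler."""
    def __init__(self, pending):
        self.pending = list(pending)

    def __len__(self):
        # The real Scheduler.__len__ returns the combined size of its
        # disk queue (self.dqs) and memory queue (self.mqs) -- which is
        # what those two attributes from question 2 hold.
        return len(self.pending)

    def has_pending_requests(self):
        return len(self) > 0


class FakeSlot:
    """Stand-in for the engine slot holding the scheduler and in-flight requests."""
    def __init__(self, scheduler, inprogress):
        self.scheduler = scheduler
        self.inprogress = set(inprogress)


class FakeEngine:
    def __init__(self, slot):
        self.slot = slot


# Two requests queued, one currently being downloaded.
engine = FakeEngine(FakeSlot(FakeScheduler(["req1", "req2"]), {"req3"}))

queued = len(engine.slot.scheduler)                    # requests waiting in the queue
pending = engine.slot.scheduler.has_pending_requests() # True while the queue is non-empty
running = len(engine.slot.inprogress)                  # requests currently in flight

print(queued, pending, running)  # → 2 True 1
```

Inside a spider callback, the same expressions apply with `self.crawler.engine` in place of `engine`.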
