Scrapy throws error ReactorNotRestartable when running on AWS Lambda

Views: 22 Published: 2021/12/6 12:48:50 Tags: amazon-web-services lambda scrapy twisted aws-lambda

Problem description

I have deployed a Scrapy project that crawls whenever a Lambda API request comes in.

It runs perfectly for the first API call, but later calls fail and throw a ReactorNotRestartable error.

As far as I understand, AWS Lambda does not kill the process between invocations, so the reactor is still present in memory.
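To illustrate the point about process reuse, here is a minimal sketch (not from the original post) of how module-level state survives between invocations that land in the same process. The handler name and the counter are hypothetical. Twisted's global reactor is process-wide state of the same kind, which would explain why a second attempt to run it in a reused process fails.

```python
# Module-level state lives as long as the process does, not just one invocation.
INVOCATIONS = 0


def handler(event, context):
    """Hypothetical handler that counts how often this process has been invoked."""
    global INVOCATIONS
    INVOCATIONS += 1
    return {"invocations_in_this_process": INVOCATIONS}
```

If the second call to `handler` reports a count above 1, it ran in a reused process, which is the situation in which an already-stopped reactor would still be loaded.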

The Lambda log error is as follows:

Traceback (most recent call last):
  File "/var/task/aws-lambda.py", line 42, in run_company_details_scrapy
    process.start()
  File "./lib/scrapy/crawler.py", line 280, in start
    reactor.run(installSignalHandlers=False)  # blocking call
  File "./lib/twisted/internet/base.py", line 1242, in run
    self.startRunning(installSignalHandlers=installSignalHandlers)
  File "./lib/twisted/internet/base.py", line 1222, in startRunning
    ReactorBase.startRunning(self)
  File "./lib/twisted/internet/base.py", line 730, in startRunning
    raise error.ReactorNotRestartable()
ReactorNotRestartable

The Lambda handler function is:

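from scrapy.crawler import CrawlerProcess

# CompanyDetailsSpidySpider is the project's spider class; its import is not shown in the post.
def run_company_details_scrapy(event, context):
    process = CrawlerProcess()
    process.crawl(CompanyDetailsSpidySpider)
    process.start()  # blocking call; this is the line the traceback points to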

As a workaround, I kept the reactor from stopping by passing a flag to the start function:

process.start(stop_after_crawl=False)

But the problem with this was that I had to wait until the Lambda invocation timed out.

I tried other solutions, but none of them seems to work. Can anyone guide me on how to solve this problem?
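One possible direction, sketched under assumptions rather than taken from the original thread: since a Twisted reactor cannot be started twice in one process, each invocation can run the crawl in a short-lived child process, so the reactor always starts fresh and the handler returns as soon as the crawl ends. The helper name `run_in_fresh_process`, the `_crawl` function, and the import path `myproject.spiders` are hypothetical. The sketch also assumes a Linux runtime where the `fork` start method is available.

```python
import multiprocessing


def run_in_fresh_process(target, *args):
    """Run target(*args) in a child process and wait for it to finish.

    Returns the child's exit code (0 if the target returned normally).
    """
    # Assumption: a Linux runtime, where the "fork" start method exists.
    ctx = multiprocessing.get_context("fork")
    proc = ctx.Process(target=target, args=args)
    proc.start()
    proc.join()
    return proc.exitcode


def _crawl():
    # Imported inside the child so the parent process never loads the reactor.
    from scrapy.crawler import CrawlerProcess
    from myproject.spiders import CompanyDetailsSpidySpider  # hypothetical path

    process = CrawlerProcess()
    process.crawl(CompanyDetailsSpidySpider)
    process.start()  # blocks until the crawl finishes, then the child exits


def run_company_details_scrapy(event, context):
    # Every invocation gets its own child process, and therefore its own reactor.
    exit_code = run_in_fresh_process(_crawl)
    return {"exit_code": exit_code}
```

Because the reactor only ever exists in the child, the parent process can be reused across invocations without hitting ReactorNotRestartable. The trade-off is the cost of starting a process and importing Scrapy on every call.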
