Scrapyd-Deploy: SPIDER_MODULES not found


Problem description

I am trying to deploy a scrapy 2.1.0 project with scrapyd-deploy 1.2 and get this error:

scrapyd-deploy example
/Library/Frameworks/Python.framework/Versions/3.8/bin/scrapyd-deploy:23: ScrapyDeprecationWarning: Module `scrapy.utils.http` is deprecated, Please import from `w3lib.http` instead.
  from scrapy.utils.http import basic_auth_header
fatal: No names found, cannot describe anything.
Packing version r1-master
Deploying to project "crawler" in http://myip:6843/addversion.json
Server response (200):
{"node_name": "spider1", "status": "error", "message": "/usr/local/lib/python3.8/dist-packages/scrapy/utils/project.py:90: ScrapyDeprecationWarning: Use of environment variables prefixed with SCRAPY_ to override settings is deprecated. The following environment variables are currently defined: EGG_VERSION\n  warnings.warn(\nTraceback (most recent call last):\n  File \"/usr/lib/python3.8/runpy.py\", line 193, in _run_module_as_main\n    return _run_code(code, main_globals, None,\n  File \"/usr/lib/python3.8/runpy.py\", line 86, in _run_code\n    exec(code, run_globals)\n  File \"/usr/local/lib/python3.8/dist-packages/scrapyd/runner.py\", line 40, in <module>\n    main()\n  File \"/usr/local/lib/python3.8/dist-packages/scrapyd/runner.py\", line 37, in main\n    execute()\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/cmdline.py\", line 142, in execute\n    cmd.crawler_process = CrawlerProcess(settings)\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/crawler.py\", line 280, in __init__\n    super(CrawlerProcess, self).__init__(settings)\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/crawler.py\", line 152, in __init__\n    self.spider_loader = self._get_spider_loader(settings)\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/crawler.py\", line 146, in _get_spider_loader\n    return loader_cls.from_settings(settings.frozencopy())\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/spiderloader.py\", line 60, in from_settings\n    return cls(settings)\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/spiderloader.py\", line 24, in __init__\n    self._load_all_spiders()\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/spiderloader.py\", line 46, in _load_all_spiders\n    for module in walk_modules(name):\n  File \"/usr/local/lib/python3.8/dist-packages/scrapy/utils/misc.py\", line 69, in walk_modules\n    mod = import_module(path)\n  File \"/usr/lib/python3.8/importlib/__init__.py\", line 127, in import_module\n    return _bootstrap._gcd_import(name[level:], package, level)\n  File \"<frozen importlib._bootstrap>\", line 1014, in _gcd_import\n  File \"<frozen importlib._bootstrap>\", line 991, in _find_and_load\n  File \"<frozen importlib._bootstrap>\", line 973, in _find_and_load_unlocked\nModuleNotFoundError: No module named 'crawler.spiders_prod'\n"}

crawler.spiders_prod is the first module defined in SPIDER_MODULES

Part of crawler.settings.py:

SPIDER_MODULES = ['crawler.spiders_prod', 'crawler.spiders_dev']
NEWSPIDER_MODULE = 'crawler.spiders_dev'
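
For context, Scrapy's spider loader imports every module listed in SPIDER_MODULES at startup, which is exactly where the traceback above fails. A simplified sketch of that behaviour (not the actual Scrapy source; compare walk_modules in the traceback):

from importlib import import_module

# Scrapy imports each entry of SPIDER_MODULES via
# scrapy.utils.misc.walk_modules(); a package that is missing from the
# deployed egg therefore raises ModuleNotFoundError, as in the server
# response above.
for name in ['crawler.spiders_prod', 'crawler.spiders_dev']:
    import_module(name)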

The crawler works locally, but when deployed it fails to import whatever I name the folder my spiders live in.

scrapyd-deploy setup.py:

# Automatically created by: scrapyd-deploy

from setuptools import setup, find_packages

setup(
    name         = 'project',
    version      = '1.0',
    packages     = find_packages(),
    entry_points = {'scrapy': ['settings = crawler.settings']},
)

scrapy.cfg:

[deploy:example]
url = http://myip:6843/
username = test
password = whatever.
project = crawler
version = GIT
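
As an aside, with version = GIT scrapyd-deploy derives the package version from the git history, which explains the "fatal: No names found, cannot describe anything." line in the output above: git describe fails in a repository without tags, and the tool falls back to a commit-count/branch version such as r1-master. A rough Python sketch of that logic (simplified, not scrapyd-client's exact code):

import subprocess

# `git describe` exits non-zero when the repository has no tags ("fatal: No
# names found, cannot describe anything."); the fallback builds a version
# string from the commit count instead.
try:
    d = subprocess.check_output(['git', 'describe']).decode().strip()
except subprocess.CalledProcessError:
    count = subprocess.check_output(['git', 'rev-list', '--count', 'HEAD']).decode().strip()
    d = 'r' + count
branch = subprocess.check_output(['git', 'rev-parse', '--abbrev-ref', 'HEAD']).decode().strip()
print('%s-%s' % (d, branch))  # e.g. "r1-master", matching the output above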

Is this possibly a bug or am I missing something?

Recommended answer

The module folders have to be initialised as Python packages. This happens by simply placing the following (empty) file into each folder defined as a module:

__init__.py
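
With those files in place, the project tree looks roughly like this (the spider file names are placeholders, not from the original project):

crawler/
├── __init__.py
├── settings.py
├── spiders_prod/
│   ├── __init__.py
│   └── some_spider.py
└── spiders_dev/
    ├── __init__.py
    └── another_spider.py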

This solved the problem described above.

Lesson learned:

If you want to split your spiders into folders, it is not enough to simply create a folder and specify it as a module in the settings file; you also need to place this __init__.py file into each new folder. Funnily enough, the crawler works without the file; only deployment to scrapyd fails.
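
The underlying reason is how the egg is built: find_packages() in the generated setup.py above only treats directories that contain an __init__.py as packages, so without the file crawler.spiders_prod never makes it into the egg that scrapyd-deploy uploads, while a local run still works thanks to Python 3's implicit namespace packages (PEP 420). A quick check from the project root (the commented output is what I would expect, not taken from the original project):

from setuptools import find_packages

# Directories lacking an __init__.py are silently skipped by find_packages()
# and therefore never packaged into the egg.
print(find_packages())
# without the __init__.py files: ['crawler']
# with them: ['crawler', 'crawler.spiders_dev', 'crawler.spiders_prod']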
