Python Shell not running Scrapy

Question

I am running the Python.org build of Python 2.7 (64-bit) on Windows Vista 64-bit to use Scrapy. I have some code that works when I run it from the Command Shell (apart from some issues with the Command Shell not recognising non-Unicode characters). However, when I try running the script from Python IDLE, I get the following error message:

Warning (from warnings module):
  File "C:\Python27\mrscrap\mrscrap\spiders\test.py", line 24
    class MySpider(BaseSpider):
ScrapyDeprecationWarning: __main__.MySpider inherits from deprecated class scrapy.spider.BaseSpider, please inherit from scrapy.spider.Spider. (warning only on first subclass, there may be others)

The code used to generate this error is:

from scrapy.spider import BaseSpider
from scrapy.selector import Selector
from scrapy.utils.markup import remove_tags
import re

class MySpider(BaseSpider):
    name = "wiki"
    allowed_domains = ["wikipedia.org"]
    start_urls = ["http://en.wikipedia.org/wiki/Asia"]

    def parse(self, response):
        titles = response.selector.xpath("normalize-space(//title)")
        for titles in titles:

            body = response.xpath("//p").extract()
            body2 = "".join(body)
            print remove_tags(body2)

Firstly, what causes this message in IDLE when the code works fine in the Command Shell? Secondly, when I follow the instructions in the message and replace both instances of BaseSpider in the code with just 'Spider', the code runs in the Python shell but does nothing: no output, nothing printed to the log, no errors or warnings.

Can anyone tell me why this revised version of the code is not printing its output to Python IDLE?

Thanks

Recommended answer

Add from scrapy.cmdline import execute to your imports.

Then put execute(['scrapy','crawl','wiki']) at the end of the file and run your script.

from scrapy.spider import Spider
from scrapy.selector import Selector
from scrapy.utils.markup import remove_tags
import re
from scrapy.cmdline import execute


class MySpider(Spider):
    name = "wiki"
    allowed_domains = ["wikipedia.org"]
    start_urls = ["http://en.wikipedia.org/wiki/Asia"]

    def parse(self, response):
        titles = response.selector.xpath("normalize-space(//title)")
        for title in titles:
            # Join every <p> element into one string and print it with the tags stripped
            body = response.xpath("//p").extract()
            body2 = "".join(body)
            print remove_tags(body2)


# Start the crawl for the spider named "wiki", like "scrapy crawl wiki" on the command line
execute(['scrapy','crawl','wiki'])
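
Running the file in IDLE only defines the spider class; nothing starts a crawl until something like execute is called. A possible refinement, offered as a sketch rather than as part of the original answer, is to wrap that call in a small function so it is not triggered whenever the module is merely imported. The names crawl_argv and run_crawl and the runner parameter are hypothetical helpers introduced here for illustration. The sketch assumes the script is run from inside the Scrapy project directory so that the crawl command can find the spider named "wiki".

```python
def crawl_argv(spider_name):
    """Build the argument list that "scrapy crawl <name>" would receive."""
    return ["scrapy", "crawl", spider_name]


def run_crawl(spider_name, runner=None):
    """Start a crawl for the named spider.

    runner is a hypothetical hook for substituting the function that
    receives the argument list; by default scrapy.cmdline.execute is
    imported lazily and used, as in the answer above.
    """
    if runner is None:
        # Imported here so that defining these helpers does not need Scrapy
        from scrapy.cmdline import execute as runner
    runner(crawl_argv(spider_name))
```

At the bottom of the spider file, the call would then sit under a guard, for example `if __name__ == "__main__": run_crawl("wiki")`, so that it fires only when the file is run directly. As far as I know, execute ends the interpreter session once the command finishes, so any statements placed after the call should not be expected to run.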
