Scrapy spider is not working


Question

Since nothing so far is working I started a new project with

python scrapy-ctl.py startproject Nu

I followed the tutorial exactly, and created the folders, and a new spider

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import HtmlXPathSelector
from scrapy.item import Item
from Nu.items import NuItem
from urls import u

class NuSpider(CrawlSpider):
    domain_name = "wcase"
    start_urls = ['http://www.whitecase.com/aabbas/']

    names = hxs.select('//td[@class="altRow"][1]/a/@href').re('/.a\w+')

    u = names.pop()

    rules = (Rule(SgmlLinkExtractor(allow=(u, )), callback='parse_item'),)

    def parse(self, response):
        self.log('Hi, this is an item page! %s' % response.url)

        hxs = HtmlXPathSelector(response)
        item = Item()
        item['school'] = hxs.select('//td[@class="mainColumnTDa"]').re('(?<=(JD,\s))(.*?)(\d+)')
        return item

SPIDER = NuSpider()

When I run

C:\Python26\Scripts\Nu>python scrapy-ctl.py crawl wcase

I get

[Nu] ERROR: Could not find spider for domain: wcase

The other spiders at least are recognized by Scrapy, this one is not. What am I doing wrong?

Thanks for your help!

Answer

Please also check the version of Scrapy. The latest version uses the "name" attribute instead of "domain_name" to uniquely identify a spider.
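If a newer Scrapy only matches spiders by `name`, a class that still sets the old `domain_name` attribute is simply never registered under "wcase". The toy lookup below (an illustration only, not Scrapy's actual internals) shows why `crawl wcase` then fails to resolve the spider:

```python
# Toy illustration (NOT Scrapy's real implementation) of how newer
# Scrapy versions resolve a spider by its "name" attribute.

class OldStyleSpider:
    domain_name = "wcase"   # old attribute: invisible to a name-based lookup

class NewStyleSpider:
    name = "wcase"          # new attribute: what the lookup matches

def find_spider(spider_classes, wanted):
    """Return the first spider class whose `name` matches `wanted`."""
    for cls in spider_classes:
        if getattr(cls, "name", None) == wanted:
            return cls
    raise LookupError("Could not find spider for domain: %s" % wanted)

# The old-style spider cannot be found by name:
# find_spider([OldStyleSpider], "wcase") raises LookupError.
# Renaming the attribute fixes the lookup:
assert find_spider([NewStyleSpider], "wcase") is NewStyleSpider
```

So in the question's spider, changing `domain_name = "wcase"` to `name = "wcase"` should let `crawl wcase` find it.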
