python - How do I use Scrapy 1.1.2?


Problem description

Scrapy 1.1.2 installed successfully on Python 3.4.4.
I then ran scrapy bench as a test:

C:\Documents and Settings\Administrator>scrapy bench
2016-09-02 18:06:42 [scrapy] INFO: Scrapy 1.1.2 started (bot: scrapybot)
2016-09-02 18:06:42 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 1, 'LOG_LEVEL': 'INFO', 'CLOSESPIDER_TIMEOUT': 10}
2016-09-02 18:06:44 [scrapy] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.closespider.CloseSpider']
2016-09-02 18:06:45 [scrapy] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.chunked.ChunkedTransferMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2016-09-02 18:06:45 [scrapy] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2016-09-02 18:06:45 [scrapy] INFO: Enabled item pipelines:
[]
2016-09-02 18:06:45 [scrapy] INFO: Spider opened
2016-09-02 18:06:45 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:46 [scrapy] INFO: Crawled 1 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:47 [scrapy] INFO: Crawled 2 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:49 [scrapy] INFO: Crawled 3 pages (at 60 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:49 [scrapy] INFO: Crawled 10 pages (at 420 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:52 [scrapy] INFO: Crawled 23 pages (at 780 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:54 [scrapy] INFO: Crawled 31 pages (at 480 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:55 [scrapy] INFO: Closing spider (closespider_timeout)
2016-09-02 18:06:55 [scrapy] INFO: Crawled 39 pages (at 480 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:56 [scrapy] INFO: Crawled 50 pages (at 660 pages/min), scraped 0 items (at 0 items/min)
2016-09-02 18:06:57 [scrapy] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 15412,
 'downloader/request_count': 50,
 'downloader/request_method_count/GET': 50,
 'downloader/response_bytes': 87156,
 'downloader/response_count': 50,
 'downloader/response_status_count/200': 50,
 'finish_reason': 'closespider_timeout',
 'finish_time': datetime.datetime(2016, 9, 2, 10, 6, 57, 218750),
 'log_count/INFO': 15,
 'request_depth_max': 4,
 'response_received_count': 50,
 'scheduler/dequeued': 50,
 'scheduler/dequeued/memory': 50,
 'scheduler/enqueued': 1001,
 'scheduler/enqueued/memory': 1001,
 'start_time': datetime.datetime(2016, 9, 2, 10, 6, 45, 609375)}
2016-09-02 18:06:57 [scrapy] INFO: Spider closed (closespider_timeout)

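Judging from this output, the installation works.
Next, I followed this post as an example: scrapy简单学习 (a simple Scrapy tutorial).
But running its crawl command produced an error, shown below: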

C:\Documents and Settings\Administrator>scrapy crawl dmoz -o items.json
Scrapy 1.1.2 - no active project

Unknown command: crawl

Use "scrapy" to see available commands


Does anyone know how Scrapy 1.1.2 is actually supposed to be used?

Solution

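To run a spider from a Scrapy project, you have to run the command from inside the project directory. Otherwise Scrapy does not recognize project-only commands such as crawl, which is what the "no active project" line in your output is telling you. For example, my project is called Spiders, so I first run cd Spiders and then execute the command.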
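
As a rough illustration of the point above: Scrapy treats a directory as being inside a project when it finds a scrapy.cfg file in that directory or one of its parents. The sketch below is not Scrapy's own code; find_project_root is a hypothetical helper written with the standard library only (and kept compatible with Python 3.4), which you could run to check whether your current directory would count as an active project.

```python
from pathlib import Path


def find_project_root(start="."):
    """Return the closest directory at or above `start` that holds scrapy.cfg.

    Hypothetical helper for illustration; returns None when no scrapy.cfg
    is found, which corresponds to the "no active project" situation.
    """
    current = Path(start).resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in [current] + list(current.parents):
        if (directory / "scrapy.cfg").is_file():
            return directory
    return None


if __name__ == "__main__":
    root = find_project_root()
    if root is None:
        print("No scrapy.cfg here or above: 'scrapy crawl' will be unknown.")
    else:
        print("Active project root: {}".format(root))
```

If this prints a project root, commands like scrapy crawl dmoz -o items.json should be available from that location. If it prints the first message, cd into the directory that contains scrapy.cfg (the one created by scrapy startproject) and try again.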
