Scrapy - logging to file and stdout simultaneously, with spider names


Problem description

I've decided to use the Python logging module because the messages generated by Twisted on standard error are too long, and I want meaningful INFO-level messages, such as those generated by the StatsCollector, to be written to a separate log file while keeping the on-screen messages.

from twisted.python import log
import logging

logging.basicConfig(level=logging.INFO, filemode='w', filename='buyerlog.txt')
observer = log.PythonLoggingObserver()
observer.start()

Well, this is fine, I've got my messages, but the downside is that I don't know which spider generated them! This is my log file, with "twisted" being displayed by %(name)s:

INFO:twisted:Log opened.
INFO:twisted:Scrapy 0.12.0.2543 started (bot: property)
INFO:twisted:scrapy.telnet.TelnetConsole starting on 6023
INFO:twisted:scrapy.webservice.WebService starting on 6080
INFO:twisted:Spider opened
INFO:twisted:Spider opened
INFO:twisted:Received SIGINT, shutting down gracefully. Send again to force unclean shutdown
INFO:twisted:Closing spider (shutdown)
INFO:twisted:Closing spider (shutdown)
INFO:twisted:Dumping spider stats:
{'downloader/exception_count': 3,
 'downloader/exception_type_count/scrapy.exceptions.IgnoreRequest': 3,
 'downloader/request_bytes': 9973,

Compare this to the messages Twisted generates on standard error:

2011-12-16 17:34:56+0800 [expats] DEBUG: number of rules: 4
2011-12-16 17:34:56+0800 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2011-12-16 17:34:56+0800 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2011-12-16 17:34:56+0800 [iproperty] INFO: Spider opened
2011-12-16 17:34:56+0800 [iproperty] DEBUG: Redirecting (301) to <GET http://www.iproperty.com.sg/> from <GET http://iproperty.com.sg>
2011-12-16 17:34:57+0800 [iproperty] DEBUG: Crawled (200) <

I've tried %(name)s, %(module)s and others, but I can't seem to get the spider name to show up. Does anyone know the answer?
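For reference, the kind of format string presumably being tried looks like the sketch below (the exact format string is an assumption, not the question's actual code); it cannot show the spider name because PythonLoggingObserver forwards every Twisted message under the single logger name "twisted", so %(name)s never resolves to anything else.

import logging

# Hypothetical formatting attempt: %(name)s is the stdlib logger name,
# which PythonLoggingObserver hard-codes to "twisted" by default,
# so the spider name never appears here.
logging.basicConfig(
    level=logging.INFO,
    filemode='w',
    filename='buyerlog.txt',
    format='%(asctime)s %(name)s %(levelname)s: %(message)s',
)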

The problem with using LOG_FILE and LOG_LEVEL in the settings is that lower-level messages will not be shown on standard error.
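For reference, the settings-based approach mentioned above would look roughly like this (a sketch of a standard settings.py, not the question's actual project); everything is routed to one file at one threshold, which is why lower-level messages stop appearing on standard error.

# settings.py (sketch)
LOG_FILE = 'buyerlog.txt'   # all Scrapy/Twisted log output goes to this file
LOG_LEVEL = 'INFO'          # single threshold; no separate level for the console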

Recommended answer

You want to use ScrapyFileLogObserver.

import logging
from scrapy.log import ScrapyFileLogObserver

# Write everything from DEBUG up to a file, in the same format Scrapy
# uses on standard error (which includes the spider name).
logfile = open('testlog.log', 'w')
log_observer = ScrapyFileLogObserver(logfile, level=logging.DEBUG)
log_observer.start()

I'm glad you asked this question; I've been wanting to do this myself.
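For completeness, here is a minimal sketch of the two-observer setup this enables, assuming the pre-1.0 scrapy.log API shown above: one ScrapyFileLogObserver writes everything (including DEBUG) to a file, while a second one keeps only INFO and above on standard error, both in Scrapy's own format with the spider name.

import logging
import sys

from scrapy.log import ScrapyFileLogObserver

# File observer: capture everything, DEBUG and up, with spider names.
logfile = open('testlog.log', 'w')
ScrapyFileLogObserver(logfile, level=logging.DEBUG).start()

# Screen observer: keep only INFO and above on standard error.
ScrapyFileLogObserver(sys.stderr, level=logging.INFO).start()

Note that this setup code has to run before the crawl starts (for example at the top of the spider module or in an extension), so the observers are already attached when the first messages are emitted.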

