How to use scrapy.log module with custom log handler?

Problem Description

I have been working on a Scrapy project and so far everything works quite well. However, I'm not satisfied with Scrapy's logging configuration possibilities. At the moment, I have set LOG_FILE = 'my_spider.log' in the settings.py of my project. When I execute scrapy crawl my_spider on the command line, it creates one big log file for the entire crawling process. This is not feasible for my purposes.

How can I use Python's custom log handlers in combination with the scrapy.log module? In particular, I want to use Python's logging.handlers.RotatingFileHandler so that I can split the log data into several small files instead of having to deal with one huge file. Unfortunately, the documentation of Scrapy's logging facility is not very extensive. Many thanks in advance!

Answer

You can log all Scrapy output to a file by first disabling the root handler via scrapy.utils.log.configure_logging and then adding your own log handler.

In the settings.py file of your Scrapy project, add the following code:

import logging
from logging.handlers import RotatingFileHandler

from scrapy.utils.log import configure_logging

# Disable Scrapy's default log settings and skip installing its root handler.
LOG_ENABLED = False
configure_logging(install_root_handler=False)

# Define your logging settings.
log_file = '/tmp/logs/CRAWLER_logs.log'  # the target directory must already exist

root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
# Roll over once the file reaches 10 MiB, keeping one rotated backup (CRAWLER_logs.log.1).
rotating_file_log = RotatingFileHandler(log_file, maxBytes=10485760, backupCount=1)
rotating_file_log.setLevel(logging.DEBUG)
rotating_file_log.setFormatter(formatter)
root_logger.addHandler(rotating_file_log)
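
If you prefer time-based rotation over size-based rotation, the standard library's logging.handlers.TimedRotatingFileHandler can be swapped in for the RotatingFileHandler above. A minimal sketch under the same setup (the rotation schedule and retention count are just placeholder choices):

from logging.handlers import TimedRotatingFileHandler

# Rotate at midnight and keep the last 7 daily files
# (CRAWLER_logs.log.2024-01-01 and so on).
timed_file_log = TimedRotatingFileHandler(log_file, when='midnight', backupCount=7)
timed_file_log.setLevel(logging.DEBUG)
timed_file_log.setFormatter(formatter)
root_logger.addHandler(timed_file_log)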

You can also adjust the log level (e.g. DEBUG to INFO) and the formatter as required. To add custom log messages inside your spiders or pipelines, use normal Python logging as follows:

Inside pipelines.py:

import logging

logger = logging.getLogger(__name__)  # propagates to the root handler configured above
logger.info('processing item')
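
Inside a spider you can do the same; Scrapy also exposes a ready-made per-spider logger as self.logger (a logger named after the spider), so no extra setup is needed there. A minimal sketch, with a hypothetical spider name and URL:

import scrapy

class MySpider(scrapy.Spider):
    name = 'my_spider'                   # hypothetical name
    start_urls = ['http://example.com']  # hypothetical URL

    def parse(self, response):
        # Records from self.logger propagate to the root logger,
        # so they end up in the rotating file configured in settings.py.
        self.logger.info('parsed %s', response.url)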

Hope this helps!
