How can I log from my python application to splunk, if I use celery as my task scheduler?

Problem description

I have a python script running on a server that should get executed once a day by the celery scheduler. I want to send the logs from the script directly to splunk, and I am trying to use the splunk_handler library for that. If I run the splunk_handler locally without celery, it seems to work. But if I run it together with celery, no logs seem to reach the splunk_handler. Console log:

[SplunkHandler DEBUG] Timer thread executed but no payload was available to send

How do I set up the loggers correctly so that all the logs go to the splunk_handler?

Apparently, celery sets up its own loggers and overwrites python's root logger. I tried several things, including connecting to celery's setup_logging signal to prevent it from overwriting the loggers, and setting up the logger inside that signal handler.

import logging
import os

from splunk_handler import SplunkHandler

This is how I set up the logger at the beginning of the file:

logger = logging.getLogger(__name__)
splunk_handler = SplunkHandler(
    host=os.getenv('SPLUNK_HTTP_COLLECTOR_URL'),
    port=os.getenv('SPLUNK_HTTP_COLLECTOR_PORT'),
    token=os.getenv('SPLUNK_TOKEN'),
    index=os.getenv('SPLUNK_INDEX'),
    debug=True)

# setFormatter expects a Formatter object, not a bare format string
splunk_handler.setFormatter(logging.Formatter(logging.BASIC_FORMAT))
splunk_handler.setLevel(os.getenv('LOGGING_LEVEL', 'DEBUG'))
logger.addHandler(splunk_handler)
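
Note that, as written, logger never gets an explicit level, so it inherits WARNING from the root logger and logger.info(...) calls are dropped before any handler sees them. A quick local check (illustrative, not part of the original question):

logger.setLevel(logging.DEBUG)  # without an explicit level the logger inherits WARNING from root
logger.info("ARBITRARY LOG MESSAGE")  # should now reach the splunk_handler when run without celery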

Celery initialisation (not sure if worker_hijack_root_logger needs to be set to False...):

from celery import Celery

app = Celery('name_of_the_application', broker=CELERY_BROKER_URL)
app.conf.timezone = 'Europe/Berlin'
app.conf.update({
    'worker_hijack_root_logger': False,
})

Here I connect to celery's setup_logging signal:

from celery.signals import setup_logging

@setup_logging.connect()
def config_loggers(*args, **kwargs):
    pass
    # logger = logging.getLogger(__name__)
    # splunk_handler = SplunkHandler(
    #     host=os.getenv('SPLUNK_HTTP_COLLECTOR_URL'),
    #     port=os.getenv('SPLUNK_HTTP_COLLECTOR_PORT'),
    #     token=os.getenv('SPLUNK_TOKEN'),
    #     index=os.getenv('SPLUNK_INDEX'),
    #     debug=True)
    #
    # splunk_handler.setFormatter(logging.BASIC_FORMAT)
    # splunk_handler.setLevel(os.getenv('LOGGING_LEVEL', 'DEBUG'))
    # logger.addHandler(splunk_handler)
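
For context, celery's setup_logging signal is the documented hook for taking over logging configuration entirely: as soon as a handler is connected to it, celery skips its own logging setup. One common variant (a sketch, not the solution from the answer below; LOGGING is assumed to be a dictConfig-style dict like the one shown there) is to apply the whole configuration inside the signal handler:

import logging.config

from celery.signals import setup_logging

@setup_logging.connect()
def configure_logging(*args, **kwargs):
    # Applying a full dictConfig here replaces celery's logging setup;
    # LOGGING is assumed to be defined elsewhere (see the answer below).
    logging.config.dictConfig(LOGGING)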

The log statement:

logger.info("ARBITRARY LOG MESSAGE")

When debug is activated on the splunk handler (set to True), the handler logs that there is no payload available to send, as posted above. Does anybody have an idea what's wrong with my code?

Answer

After hours of figuring out what could be wrong with my code, I now have a result that satisfies me. First, I created a file loggingsetup.py where I configured my Python loggers with dictConfig:

# loggingsetup.py
import logging
import os

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {  # sets up the format of the logging output
        'simple': {
            'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
            'datefmt': '%y %b %d, %H:%M:%S',
        },
    },
    'filters': {
        'filterForSplunk': {  # custom filter so that logs with "celery" in the logger name are not sent to Splunk
            '()': 'loggingsetup.RemoveCeleryLogs',  # class defined at the top of this file
            'logsToSkip': 'celery'  # word that is filtered for
        },
    },
    'handlers': {
        'splunk': {  # handler for Splunk, level WARNING so that not too many logs are sent
            'level': 'WARNING',
            'class': 'splunk_logging_handler.SplunkLoggingHandler',
            'url': os.getenv('SPLUNK_HTTP_COLLECTOR_URL'),
            'splunk_key': os.getenv('SPLUNK_TOKEN'),
            'splunk_index': os.getenv('SPLUNK_INDEX'),
            'formatter': 'simple',
            'filters': ['filterForSplunk']
        },
        'console': {
            'level': 'DEBUG',
            'class': 'logging.StreamHandler',
            'stream': 'ext://sys.stdout',
            'formatter': 'simple',
        },
    },
    'loggers': {
        '': {  # the root logger is used
            'handlers': ['console', 'splunk'],
            'level': 'DEBUG',
            'propagate': False,  # boolean False, not the string 'False' (a non-empty string is truthy)
        }
    }
}

For the logging filter, I had to create a class that inherits from logging.Filter. The class also lives in loggingsetup.py:

class RemoveCeleryLogs(logging.Filter):  # custom filter class to keep celery's own logs from being sent to Splunk
    def __init__(self, logsToSkip=None):
        super().__init__()
        self.logsToSkip = logsToSkip

    def filter(self, record):
        if self.logsToSkip is None:
            return True
        return self.logsToSkip not in record.name  # True = record passes through
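
A quick sanity check of the filter (illustrative, not part of the original answer): a record whose logger name contains "celery" is rejected, while application records pass through.

celery_record = logging.LogRecord(
    'celery.worker', logging.INFO, __file__, 0, 'celery log', None, None)
app_record = logging.LogRecord(
    'my_app', logging.INFO, __file__, 0, 'app log', None, None)

log_filter = RemoveCeleryLogs(logsToSkip='celery')
print(log_filter.filter(celery_record))  # False -> dropped before reaching Splunk
print(log_filter.filter(app_record))     # True  -> forwarded to the splunk handler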

After that, you can configure the loggers like this:

import logging.config

import loggingsetup  # the file defined above

logging.config.dictConfig(loggingsetup.LOGGING)
logger = logging.getLogger('')  # the root logger configured above

And because celery redirects its logs and messages were showing up twice, I had to update app.conf:

app.conf.update({
    'worker_hijack_root_logger': False, # so celery does not set up its loggers
    'worker_redirect_stdouts': False, # so celery does not redirect its logs
})

The next problem I faced was that the splunk_logging library I had chosen mixed something up with the URL, so I had to create my own splunk handler class that inherits from logging.Handler. The important lines are the following:

auth_header = {'Authorization': 'Splunk {0}'.format(self.splunk_key)}
json_message = {"index": str(self.splunk_index), "event": data}
r = requests.post(self.url, headers=auth_header, json=json_message)
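
Embedded in a complete handler, this could look roughly like the sketch below. The class name and constructor parameters are illustrative, not the author's actual code; only the three lines above are taken from the answer:

import logging

import requests

class CustomSplunkHandler(logging.Handler):
    def __init__(self, url, splunk_key, splunk_index):
        super().__init__()
        self.url = url
        self.splunk_key = splunk_key
        self.splunk_index = splunk_index

    def emit(self, record):
        try:
            data = self.format(record)  # apply the configured formatter
            auth_header = {'Authorization': 'Splunk {0}'.format(self.splunk_key)}
            json_message = {"index": str(self.splunk_index), "event": data}
            requests.post(self.url, headers=auth_header, json=json_message)
        except Exception:
            self.handleError(record)  # never let logging errors crash the app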

I hope this answer helps someone who is facing similar problems with python, splunk, and celery logging! :)
