how to use sincedb in logstash?


Problem description

I have thousands of log files, and new ones are downloaded every day. I am using Logstash and Elasticsearch for parsing, indexing, and searching.

Now I am using the file input plugin to read and parse the downloaded files. I have not set sincedb_path, so it is stored under $HOME. The problem is that it only reads the log files for one day. Here is my input configuration:

input {
  file {
    path => "/logs/downloads/apacheLogs/env1/**/*"
    type => "env1"
    exclude => "*.gz"
    start_position => "beginning"
  }
  file {
    path => "/logs/downloads/appLogs/env2/**/*"
    type => "env2"
    exclude => "*.gz"
    start_position => "beginning"
  }
}  
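
For reference, the file input also accepts an explicit sincedb_path per input, so the position files do not have to live under $HOME. A minimal sketch of the first input with that option set; the directory /var/lib/logstash is illustrative, not from the question, and must exist and be writable by the Logstash user:

input {
  file {
    path => "/logs/downloads/apacheLogs/env1/**/*"
    type => "env1"
    exclude => "*.gz"
    start_position => "beginning"
    # hypothetical location; by default Logstash writes .sincedb_* files under $HOME
    sincedb_path => "/var/lib/logstash/sincedb-env1"
  }
}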


Answer

This is apparently caused by a bug in the File handler.

When the File{} input reads a log file, the last byte processed is saved and periodically copied out to the sincedb file. While you can set that file to /dev/null if you want, Logstash only reads it during startup and afterwards works from a table held in memory.
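
For context, each line of a sincedb file records the inode of a watched file and how far Logstash has read into it. A made-up sketch of the on-disk layout used by older Logstash versions (the columns are inode, major device number, minor device number, and byte offset; the numbers here are invented):

262425 0 51713 1048576
262426 0 51713 0

The inode in the first column is exactly what makes the bug described next possible: positions are keyed by inode, not by file name.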

The problem is that the in-memory table indexes position by inode and is never pruned, even when it detects that a given file no longer exists. If you delete a file and then add a new one -- even one with a different name -- it may well get the same inode number, and the File handler will think it is the same file.

If the new file is larger, the handler will only read from the previous maximum byte onwards and update the table. If the new file is smaller, it seems to assume the file was truncated and may start processing again from the default position.

As a result, the only way to handle this is to set sincedb to /dev/null and then restart Logstash (causing the internal table to be lost); every file matching the pattern will then be re-read from the beginning. That has problems of its own, since some of the files may not be new.
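
A sketch of that workaround applied to the first input from the question (sincedb_path is a real option of the file input; pointing it at /dev/null simply makes the saved positions vanish on every restart):

input {
  file {
    path => "/logs/downloads/apacheLogs/env1/**/*"
    type => "env1"
    exclude => "*.gz"
    start_position => "beginning"
    # discard saved read positions, so a restart re-reads every matching file
    sincedb_path => "/dev/null"
  }
}

Note that start_position => "beginning" only applies to files Logstash has no sincedb entry for; with sincedb_path set to /dev/null, every file counts as new after a restart, so already-indexed files will be re-read and re-indexed as well.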
