FileBeat harvesting issues


Question

We are using ELK to manage our program logs. In our FileBeat config we harvest from 30 different paths, which contain files that update every second (they update every second only on the prod machines; on the other dev machines we have significantly fewer logs). Our log files are not deleted until they get old and we stop using them (we also never rename them). Recently we found that the logs from the last paths in the configuration file (.yml) on the prod machines never appear in Kibana.

After investigating, we realized that FileBeat gets stuck on the files of the first paths and never seems to reach the last ones. When I moved the last two paths to the beginning of the file, FileBeat started to register all the logs there and later harvested them.

I looked through the documentation on FileBeat configuration and saw the close_* options (close_option_config), which seem like a good idea. But I haven't managed to get them right yet, and I'm not sure what the recommended value for the scan_frequency option is (currently the default of 10s) or what would serve me best.

I tried changing close_timeout to 15s and scan_frequency to 2m:

      close_timeout: 15s
      scan_frequency: 2m
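
For reference, here is a minimal sketch of how these settings sit together inside a single prospector. The path, document_type, and the close_inactive and harvester_limit values are illustrative assumptions, not part of my real config:

    -
      paths:
        - D:\logs\*\example_app.log.txt   # hypothetical path, for illustration only
      input_type: log
      document_type: example
      scan_frequency: 2m      # how often the prospector checks the paths for new files
      close_inactive: 5m      # close a file handle after 5m without new lines
      close_timeout: 15s      # hard cap on how long a harvester may stay open
      harvester_limit: 10     # cap on concurrent harvesters for this prospector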

I would like to hear some opinions on what I can do to solve this problem. I am putting the config here for reference, to see whether I missed something else.

My filebeat.yml (before the changes):

filebeat:
  # List of prospectors to fetch data.
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      paths:
        - D:\logs\*\path1a_*_Pri_app.log.txt
      input_type: log
      document_type: type1
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
    -
      paths:
        - D:\logs\*\path2_*_Paths_app.log.txt
      input_type: log
      document_type: type2
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
    -
      paths:
        - D:\logs\*\path3c_*_R_app.log.txt
      input_type: log
      document_type: path3
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
    -
      paths:
        - D:\logs\*\path4d_*_d_app.log.txt
        - C:\logs\*\path4d_*_d_app.log.txt
      input_type: log
      document_type: path4
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after

..... same as above

    -
      paths:
        - D:\logs\*\path27S.Coordinator_Z.*.log*
        - C:\logs\*\path27S.Coordinator_Z*.log*
      input_type: log
      document_type: path27
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
    -
      paths:
        - D:\logs\*\path28d_*_Tr_app.log.txt
        - C:\logs\*\path28d_*_Tr_app.log.txt
      input_type: log
      document_type: path28
      multiline.pattern: '^[0-9]{4}-[0-9]{2}-[0-9]{2}'
      multiline.negate: true
      multiline.match: after
    -
      paths:
        - D:\logs\*\R1_OutputR*pid_*_rr_*
      input_type: log
      document_type: path29
      multiline.pattern: '<?xml version="1.0" encoding="UTF-8"?>'
      multiline.negate: true
      multiline.match: after  
    -
      paths:
        - D:\logs\*\R2_OutputR*pid_*_rr_*
      input_type: log
      document_type: path30
      multiline.pattern: '<?xml version="1.0" encoding="UTF-8"?>'
      multiline.negate: true
      multiline.match: after

  registry_file: "C:/ProgramData/filebeat/registry"

Answer

After a long investigation, during which I tried to find a similar problem with a solution and tried my luck on the Elastic discuss forum, I managed to solve this issue.

Since I didn't see this option anywhere on the web, I am putting it here.

Filebeat's harvesting system apparently has its limits when dealing with a large number of open files at the same time. (This is a known problem, and the Elastic team also provides a bunch of config options to help deal with it and tailor ELK to your needs, e.g. config_options.) I managed to solve my problem by opening 2 more Filebeat services, whose prospectors I configured in the following way (this is an example for A; the same goes for B):

    -
      paths:
        - D:\logs\*\pid_*_rr_*
      input_type: log
      document_type: A
      multiline.pattern: '<?xml version="1.0" encoding="UTF-8"?>'
      multiline.negate: true
      multiline.match: after
      close_eof: true
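
One practical note when splitting the work across several Filebeat services: each instance must have its own configuration and, crucially, its own registry_file, otherwise the instances overwrite each other's read state. A minimal sketch of what one extra instance's filebeat.yml might look like (the registry path below is an assumption for illustration, not from my actual setup):

    filebeat:
      prospectors:
        -
          paths:
            - D:\logs\*\pid_*_rr_*
          input_type: log
          document_type: A
          multiline.pattern: '<?xml version="1.0" encoding="UTF-8"?>'
          multiline.negate: true
          multiline.match: after
          close_eof: true
      # each instance tracks its state in a separate registry so the
      # services do not clobber each other's file offsets
      registry_file: "C:/ProgramData/filebeat-A/registry"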

In this way, because the services work independently of each other, Filebeat keeps operating on all of them (instead of getting "stuck" on the first prospectors).

This way I managed to double my harvesting capacity.

I also posted this as a discussion on the Elastic website: the discussion
