CloudWatch logs acting weird


Question

I have two log files with multi-line log statements. Both have the same datetime format at the beginning of each log statement. The configuration looks like this:

state_file = /var/lib/awslogs/agent-state

[/opt/logdir/log1.0]
datetime_format = %Y-%m-%d %H:%M:%S
file = /opt/logdir/log1.0
log_stream_name = /opt/logdir/logs/log1.0
initial_position = start_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = my.log.group


[/opt/logdir/log2-console.log]
datetime_format = %Y-%m-%d %H:%M:%S
file = /opt/logdir/log2-console.log
log_stream_name = /opt/logdir/log2-console.log
initial_position = start_of_file
multi_line_start_pattern = {datetime_format}
log_group_name = my.log.group

The CloudWatch Logs agent is sending the log1.0 logs correctly to my log group on CloudWatch; however, it's not sending the logs for log2-console.log.

awslogs.log says:

2016-11-15 08:11:41,308 - cwlogs.push.batch - WARNING - 3593 - Thread-4 - Skip event: {'timestamp': 1479196444000, 'start_position': 42330916L, 'end_position': 42331504L}, reason: timestamp is more than 2 hours in future.
2016-11-15 08:11:41,308 - cwlogs.push.batch - WARNING - 3593 - Thread-4 - Skip event: {'timestamp': 1479196451000, 'start_position': 42331504L, 'end_position': 42332092L}, reason: timestamp is more than 2 hours in future.

Server time is correct, though. Another weird thing is that the line numbers mentioned in start_position and end_position do not exist in the actual log file being pushed.
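For reference, the timestamp values in the skipped events are epoch milliseconds. A quick sketch (not part of the original question) decodes the first one; if the server clock is on UTC, it falls only minutes before the agent's own 08:11:41 log line, not two hours in the future:

from datetime import datetime, timezone

# 'timestamp' from the first skipped event above, in epoch milliseconds
ts_ms = 1479196444000
print(datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc))
# prints: 2016-11-15 07:54:04+00:00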

Anyone else experiencing this issue?

Answer

I was able to resolve this issue.

The state of awslogs was broken. The state is stored in an SQLite database at /var/awslogs/state/agent-state (use the state_file path from your own configuration; the question's config points at /var/lib/awslogs/agent-state). You can access it via

sudo sqlite3 /var/awslogs/state/agent-state

sudo is needed to have write access.

List all streams:

select * from stream_state;

Look up your log stream and note the source_id, which is part of a JSON data structure in the v column.

Then, list all records with this source_id (in my case it was 7675f84405fcb8fe5b6bb14eaa0c4bfd) in the push_state table:

select * from push_state where k='7675f84405fcb8fe5b6bb14eaa0c4bfd';

The resulting record has a JSON data structure in the v column which contains a batch_timestamp. And this batch_timestamp seems to be wrong: it was in the past, and any newer (more than 2 hours) log entries were not processed anymore.

The solution is to update this record. Copy the v column, replace the batch_timestamp with the current timestamp, and update with something like

update push_state set v='... insert new value here ...' where k='7675f84405fcb8fe5b6bb14eaa0c4bfd';
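
If you prefer to script the edit, here is a minimal sketch (not from the original answer) that does the same thing. It assumes v holds JSON text and that batch_timestamp is in epoch milliseconds, matching the event timestamps seen in awslogs.log:

import json
import sqlite3
import time

# source_id found in the steps above; substitute your own
SOURCE_ID = '7675f84405fcb8fe5b6bb14eaa0c4bfd'

conn = sqlite3.connect('/var/awslogs/state/agent-state')
row = conn.execute('select v from push_state where k = ?', (SOURCE_ID,)).fetchone()

state = json.loads(row[0])
# assumption: batch_timestamp is stored in epoch milliseconds
state['batch_timestamp'] = int(time.time() * 1000)

conn.execute('update push_state set v = ? where k = ?', (json.dumps(state), SOURCE_ID))
conn.commit()
conn.close()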

Restart the service:

sudo /etc/init.d/awslogs restart

Hope this helps!

