Timeline of Kubernetes events
Question
I would like to be able to see everything that happened to a kube cluster on a timeline, including when nodes were found to be dead, when new nodes were added, when pods crashed, and when they were restarted.
So far the best that we have found is kubectl get event, but that seems to have a few limitations:
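- It doesn't go very far back in time (I'm not sure how far back it goes. A day?)
- It combines similar events and sorts the resulting list by the time of the latest event in each group. This makes it impossible to know what happened in a given time range, because events in that range may have been merged with later events outside it.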
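To make the second limitation concrete, here is a minimal Python sketch of what a combined event record can and cannot tell you about a time window. It assumes the record carries the firstTimestamp, lastTimestamp and count fields of a core/v1 Event, with both timestamps set in the second-resolution UTC form shown; the function names are made up for this illustration.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a UTC timestamp such as '2020-01-01T10:00:00Z'."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def occurrences_in_window(event, start, end):
    """Say what a combined event reveals about the window [start, end].

    'none'    : every occurrence lies outside the window
    'all'     : every occurrence lies inside the window
    'partial' : the record spans a window boundary, so the number of
                occurrences inside the window cannot be recovered
    """
    first = parse_ts(event["firstTimestamp"])
    last = parse_ts(event["lastTimestamp"])
    if last < start or first > end:
        return "none"
    if start <= first and last <= end:
        return "all"
    return "partial"
```

For a record with count 5 that first fired at 10:00 and last fired at 12:00, a query for 10:30 to 11:00 comes back as 'partial': anywhere from zero to three of the five occurrences could fall in that window, and the record alone cannot say which.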
One idea that I have is to write a pod that uses the API to watch the stream of events and log them to a file. This would let us control retention, and it seems that events that occur while we are watching would not be combined, which would solve the second problem as well.
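A rough sketch of such a watcher, using only the Python standard library, might look like the following. It assumes the pod runs with a service account that is allowed to list and watch events, that the credentials are mounted at the usual in-cluster path, and that the API server is reachable as kubernetes.default.svc. The function names and the flat record layout are choices made for this example, not part of any Kubernetes API.

```python
import json
import ssl
import urllib.request

# Usual in-cluster mount point for service account credentials (assumed).
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"
# Watch core/v1 events in all namespaces.
EVENTS_URL = "https://kubernetes.default.svc/api/v1/events?watch=true"


def to_log_record(raw_line):
    """Turn one line of the watch stream into a flat record for the log file."""
    change = json.loads(raw_line)
    obj = change["object"]
    involved = obj.get("involvedObject", {})
    return {
        "type": change["type"],  # e.g. ADDED, MODIFIED or DELETED
        "namespace": obj.get("metadata", {}).get("namespace"),
        "kind": involved.get("kind"),
        "name": involved.get("name"),
        "reason": obj.get("reason"),
        "message": obj.get("message"),
        "count": obj.get("count"),
        "lastTimestamp": obj.get("lastTimestamp"),
    }


def watch_events(log_path):
    """Stream cluster events and append one JSON record per line to log_path."""
    with open(SA_DIR + "/token") as f:
        token = f.read().strip()
    context = ssl.create_default_context(cafile=SA_DIR + "/ca.crt")
    request = urllib.request.Request(
        EVENTS_URL, headers={"Authorization": "Bearer " + token}
    )
    with urllib.request.urlopen(request, context=context) as response, \
            open(log_path, "a") as log:
        for raw_line in response:  # the watch stream is one JSON object per line
            if not raw_line.strip():
                continue
            log.write(json.dumps(to_log_record(raw_line)) + "\n")
            log.flush()
```

Inside the pod this would be started with something like watch_events("/var/log/kube-events.jsonl"), where the path is hypothetical. A repeated event would be expected to show up as a MODIFIED entry with a higher count, so each occurrence gets its own line in the file. The sketch does not reconnect when the API server closes the watch, which a real deployment would need to handle.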
What are other people doing about this?
Recommended answer
-
My understanding is that Kubernetes itself deduplicates events, as documented here: https://github.com/kubernetes/kubernetes/blob/master/docs/design/event_compression.md
Once that happens, there is no way to get the individual events back.
See https://github.com/kubernetes/kubernetes/issues/36304 for complaints about how that loses information. https://github.com/kubernetes/kubernetes/pull/46034 at least improved the message. See also the KEP at https://github.com/kubernetes/enhancements/pull/1291 for more recent discussion and a proposal to improve usability in kubectl.
How long are events retained? Their "time-to-live" is apparently controlled by the kube-apiserver --event-ttl option, which defaults to 1 hour: https://github.com/kubernetes/kubernetes/blob/da53a247633/cmd/kube-apiserver/app/options/options.go#L71-L72
You can raise this. It might require more resources for etcd; from what I saw in some 2015 GitHub discussions, the event TTL used to be 2 days, and events were the main thing stressing etcd...
In a pinch, it might be possible to figure out what happened earlier from various logs, especially the kubelet logs?
-
Running kubectl get event -o yaml --watch into a persistent file sounds like a simple thing to do. I think when you watch events as they arrive, you see them pre-dedup.
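As a hedged sketch of that approach, the Python below runs kubectl in watch mode and appends each object it prints to a file, one JSON record per line, together with the time it was seen. It asks for -o json rather than -o yaml only because the Python standard library has no YAML parser, and it assumes kubectl writes its output as a series of concatenated JSON documents. The function names, the seenAt field and the record layout are inventions of this example.

```python
import json
import subprocess
from datetime import datetime, timezone


def iter_json_documents(chunks):
    """Yield each JSON document from an iterable of text chunks.

    The documents may be pretty-printed and simply concatenated.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            try:
                document, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # incomplete document, wait for more input
            yield document
            buffer = buffer[end:]


def capture(log_path):
    """Append every object printed by kubectl to log_path, one record per line."""
    command = ["kubectl", "get", "event", "--all-namespaces", "--watch", "-o", "json"]
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as process, \
            open(log_path, "a") as log:
        for event in iter_json_documents(process.stdout):
            record = {
                "seenAt": datetime.now(timezone.utc).isoformat(),
                "event": event,
            }
            log.write(json.dumps(record) + "\n")
            log.flush()
```

Recording the arrival time separately means the file can be sorted or filtered by when each update was observed, independently of the timestamps inside the event. If kubectl prints the initial batch as a single list object, that batch ends up as one record and would need to be unpacked when reading the file back.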
Heapster can send events to some of the supported sinks: https://github.com/kubernetes/heapster/blob/master/docs/sink-configuration.md