How to account for clock offsets in a distributed system?

Views: 216 Posted: 2020/10/2 21:01:18 Tags: time synchronization distributed-system clock


Problem description

I have a system consisting of several distributed services, each of which is continuously generating events and reporting them to a central service.

I need to present a unified timeline of the events, where the ordering in the timeline corresponds to the moment each event occurred. The frequency of events and the network latency are such that I cannot simply use the time of arrival at the central collector to order them.

For example, in a situation like the following:

E1 needs to be rendered in the timeline above E2, despite arriving at the collector afterwards, which means the events need to come with timestamp metadata. This is where the problem arises.

Due to constraints on how the environment is set up, it is not possible to ensure that the local time services on each machine are reliably aware of the current UTC time. I can assume that each machine can accurately gauge relative time, i.e. the clock speeds are close enough that measurements of short timespans are identical, but problems like NTP misconfiguration/partitioning make it impossible to guarantee that every machine agrees on the current UTC time.

This means that the naive approach of generating a local timestamp for each event as it occurs, then ordering events by that timestamp, will not work: every machine has its own opinion of what universal time is.
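To make the failure concrete, here is a minimal sketch of two events stamped by machines whose clocks disagree. The `Event` class, the `true_time` field (time on an ideal reference clock, which is not observable in the real system) and the offset values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    name: str
    true_time: float     # seconds on an ideal reference clock (unknown in practice)
    clock_offset: float  # error of the emitting machine's clock, in seconds

    @property
    def local_timestamp(self) -> float:
        # What the machine would actually attach to the event.
        return self.true_time + self.clock_offset


def order_by_local_timestamp(events):
    """Naive ordering: sort by whatever each machine believed the time was."""
    return [e.name for e in sorted(events, key=lambda e: e.local_timestamp)]


def order_by_true_time(events):
    """The ordering the timeline should show (not computable in practice)."""
    return [e.name for e in sorted(events, key=lambda e: e.true_time)]


# E1 really happens first, but its machine's clock runs 5 s fast,
# while the machine emitting E2 runs 1 s slow.
e1 = Event("E1", true_time=10.0, clock_offset=+5.0)
e2 = Event("E2", true_time=12.0, clock_offset=-1.0)
events = [e2, e1]

naive_order = order_by_local_timestamp(events)
wanted_order = order_by_true_time(events)
```

With these made-up offsets the naive ordering puts E2 before E1, the opposite of what the timeline should show.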

So the question is: how can I recover an ordering for events generated in a distributed system where the clocks do not agree?

Most solutions I find online go down the path of trying to synchronize all the clocks, which is not possible for me because:

I don't control the machines in question

The reason the clocks are out of sync in the first place is network flakiness, which I cannot fix

My own idea was to query some kind of central time service every time an event is generated, then stamp that event with the retrieved time minus the network flight time. This gets hairy, because I have to add another service to the system and ensure its availability (I'm back to square one if the other services can't reach it). I was hoping there is some clever way to do this that doesn't require me to centralize timekeeping in this way.
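The central-time-service idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_server_time` is a hypothetical callable standing in for the request to the time service, the request and reply are assumed to take about the same time, and server processing time is assumed negligible. The round trip is measured with the local monotonic clock, which relies only on the machine measuring short timespans accurately.

```python
import time
from typing import Callable


def stamp_event(
    fetch_server_time: Callable[[], float],
    monotonic: Callable[[], float] = time.monotonic,
) -> float:
    """Estimate, in the time service's clock, when the event occurred.

    fetch_server_time is a hypothetical stand-in for a network call that
    returns the service's current time in seconds.
    """
    sent = monotonic()                 # event occurs, request leaves
    server_time = fetch_server_time()  # service reads its clock on arrival
    received = monotonic()             # reply arrives back

    round_trip = received - sent
    # Assumption: symmetric latency, so the one-way flight time is half
    # the round trip. The service read its clock one flight time after
    # the event, so subtract that to get back to the event itself.
    return server_time - round_trip / 2


# Simulated usage with a fake clock and a fake service (illustrative values).
fake_clock = {"t": 0.0}


def fake_monotonic() -> float:
    return fake_clock["t"]


def fake_fetch() -> float:
    fake_clock["t"] += 0.2                   # request in flight
    server_time = 5000.0 + fake_clock["t"]   # service clock is 5000 s ahead
    fake_clock["t"] += 0.2                   # reply in flight
    return server_time


estimated = stamp_event(fake_fetch, fake_monotonic)
```

If the two legs of the trip are not symmetric, the estimate is off by up to half the round trip, and the availability concern described above still applies.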
