用于检测陈旧数据的可扩展算法 [英] Scalable algorithm to detect stale data
问题描述
问题出在这里:
安装在许多不同服务器上的代理"每5秒将心跳"信号发送到中央服务器.我该如何找到主动失去心跳超过10秒并发出警报的人?
"Agents" installed on many different servers send "heartbeat" signals up to a central server every 5 seconds. How can I find the ones that have missed their heartbeat for more than 10 seconds actively and raise an alert?
如果您不考虑可扩展性,那么问题很简单.以最简单的形式,您可以在数据库表中记录从每个代理收到的最新心跳的时间戳,并运行常规查询以查找早于阈值的时间戳.
The problem is simple if you don't think about scalablity. In the simplest form, you can record the timestamp of the latest heartbeat received from each agent in a database table and run a regular query to find the ones older than the threshold.
但是,该解决方案无法扩展到数百万个代理.
This solution is however not scalable to millions of agents.
我正在寻找使之成为可能的算法或技术.
I am looking for algorithms or techologies that make this possible.
推荐答案
- 使用地图:AgentId-> LastHearbeatTime
- 使用11组(假设分辨率为1秒就足够了),每组都在1秒的窗口中保存报告的特工ID.
每次座席报告心跳: 1.在地图上找到它 2.从相关集中将其删除 3.在地图上更新 4.将其添加到相关集合中
Every time an agent report a hearthbeat: 1. Find it in the map 2. Delete it from the relevant set 3. Update it in the map 4. Add it to the relevant set
定义线程:每秒一次,最旧的设置过期.它应该是空的.如果不是,则包含未报告的代理商ID.一组过期后,您可以重复使用它(一组循环的数组).
Define a thread: Once per second, the oldest set expires. It should be empty. If it doesn't - it contains Ids of agents which did not report. Once a set expires, you can reuse it (cyclic array of sets).
我相信它可以不带锁(可能需要12套)来实现.
I believe it can be implemented without locks (maybe you'll need 12 sets).
这篇关于用于检测陈旧数据的可扩展算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!