用于检测陈旧数据的可扩展算法 [英] Scalable algorithm to detect stale data

查看:95
本文介绍了用于检测陈旧数据的可扩展算法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

问题出在这里:

安装在许多不同服务器上的代理"每5秒将心跳"信号发送到中央服务器.我该如何找到主动失去心跳超过10秒并发出警报的人?

"Agents" installed on many different servers send "heartbeat" signals up to a central server every 5 seconds. How can I find the ones that have missed their heartbeat for more than 10 seconds actively and raise an alert?

如果您不考虑可扩展性,那么问题很简单.以最简单的形式,您可以在数据库表中记录从每个代理收到的最新心跳的时间戳,并运行常规查询以查找早于阈值的时间戳.

The problem is simple if you don't think about scalablity. In the simplest form, you can record the timestamp of the latest heartbeat received from each agent in a database table and run a regular query to find the ones older than the threshold.

但是,该解决方案无法扩展到数百万个代理.

This solution is however not scalable to millions of agents.

我正在寻找使之成为可能的算法或技术.

I am looking for algorithms or techologies that make this possible.

推荐答案

  1. 使用地图:AgentId-> LastHearbeatTime
  2. 使用11组(假设分辨率为1秒就足够了),每组都在1秒的窗口中保存报告的特工ID.

每次座席报告心跳: 1.在地图上找到它 2.从相关集中将其删除 3.在地图上更新 4.将其添加到相关集合中

Every time an agent report a hearthbeat: 1. Find it in the map 2. Delete it from the relevant set 3. Update it in the map 4. Add it to the relevant set

定义线程:每秒一次,最旧的设置过期.它应该是空的.如果不是,则包含未报告的代理商ID.一组过期后,您可以重复使用它(一组循环的数组).

Define a thread: Once per second, the oldest set expires. It should be empty. If it doesn't - it contains Ids of agents which did not report. Once a set expires, you can reuse it (cyclic array of sets).

我相信它可以不带锁(可能需要12套)来实现.

I believe it can be implemented without locks (maybe you'll need 12 sets).

这篇关于用于检测陈旧数据的可扩展算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆