Apache Flink - Matching Fields with the Same Value


Question

We have a use case where we need to detect a brute-force pattern: 10 failed logons from the same device and the same username, followed by a successful logon from that same username and device. All of this should happen within 10 minutes.

Say we have 10 failed-logon Windows events with user A as the username and B as the device name, followed by a successful logon from user A on the same device B; we should then raise an alert. Is there a way to meet this use case with Flink CEP? The device and username won't be known beforehand, and the cardinality of these fields is not known either.

Recommended answer

With Flink CEP (using the Java DataStream API), you would use something like keyBy(event -> new Tuple2<>(event.user, event.device)) and then match the pattern against that key-partitioned stream. With Flink SQL's MATCH_RECOGNIZE, you want to PARTITION BY user, device. In both cases the pattern is evaluated separately for each (user, device) pair, so the usernames and devices do not need to be known in advance.

The time constraint is handled by the WITHIN clause. For example:

PATTERN (F{10} S) WITHIN INTERVAL '10' MINUTE
DEFINE
  F AS F.status = 'failure',
  S AS S.status = 'success'
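
To make the per-key semantics concrete, here is a minimal plain-Java sketch. It is not Flink code and is not how Flink implements CEP; it only imitates what the partitioned pattern above is meant to detect. The LoginEvent class, its field names, the detect method, and the "user@device" alert format are all hypothetical. It also assumes events arrive in timestamp order, that any non-failure event breaks a run of failures, and that the window bound is exclusive.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration (not Flink) of matching "10 failures then a success" per (user, device). */
class BruteForceSketch {

    /** Hypothetical event shape; the field names are assumptions, not part of any Flink API. */
    static final class LoginEvent {
        final String user;
        final String device;
        final String status;     // "failure" or "success"
        final long timestampMs;  // event time in milliseconds

        LoginEvent(String user, String device, String status, long timestampMs) {
            this.user = user;
            this.device = device;
            this.status = status;
            this.timestampMs = timestampMs;
        }
    }

    static final int FAILURES_REQUIRED = 10;
    static final long WINDOW_MS = 10 * 60 * 1000L; // 10 minutes

    /** Returns "user@device" for each key that completes the pattern; input must be in time order. */
    static List<String> detect(List<LoginEvent> eventsInTimeOrder) {
        // One independent run of failure timestamps per (user, device) key,
        // mirroring keyBy(...) / PARTITION BY user, device.
        Map<List<String>, Deque<Long>> failuresByKey = new HashMap<>();
        List<String> alerts = new ArrayList<>();

        for (LoginEvent e : eventsInTimeOrder) {
            List<String> key = Arrays.asList(e.user, e.device);
            Deque<Long> failures = failuresByKey.computeIfAbsent(key, k -> new ArrayDeque<>());

            if ("failure".equals(e.status)) {
                // Keep only the most recent 10 consecutive failures for this key.
                failures.addLast(e.timestampMs);
                if (failures.size() > FAILURES_REQUIRED) {
                    failures.removeFirst();
                }
            } else {
                // A success right after 10 failures, with the whole sequence
                // spanning less than 10 minutes, completes the pattern.
                if ("success".equals(e.status)
                        && failures.size() == FAILURES_REQUIRED
                        && e.timestampMs - failures.peekFirst() < WINDOW_MS) {
                    alerts.add(e.user + "@" + e.device);
                }
                // Any non-failure event ends the current run for this key.
                failures.clear();
            }
        }
        return alerts;
    }
}
```

Because the state lives in a map keyed by (user, device), events from other users or devices can be interleaved freely without affecting a given pair's run. That is the same effect keying or partitioning has in Flink, where the runtime manages this state for you.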
