Using Athena to get terminatingrule from rulegrouplist in AWS WAF logs

Question

I followed these instructions to get my AWS WAF data into an Athena table.

I would like to query the data to find the latest requests with an action of BLOCK. This query works:

SELECT
  from_unixtime(timestamp / 1000e0) AS date,
  action,
  httprequest.clientip AS ip,
  httprequest.uri AS request,
  httprequest.country as country,
  terminatingruleid,
  rulegrouplist
FROM waf_logs
WHERE action='BLOCK'
ORDER BY date DESC
LIMIT 100;

My issue is cleanly identifying the "terminatingrule", that is, the reason the request was blocked. As an example, one result has:

terminatingruleid = AWS-AWSManagedRulesCommonRuleSet

rulegrouplist = [
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesAmazonIpReputationList",
    "terminatingrule": "null",
    "excludedrules": "null"
  },
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesKnownBadInputsRuleSet",
    "terminatingrule": "null",
    "excludedrules": "null"
  },
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesLinuxRuleSet",
    "terminatingrule": "null",
    "excludedrules": "null"
  },
  {
    "nonterminatingmatchingrules": [],
    "rulegroupid": "AWS#AWSManagedRulesCommonRuleSet",
    "terminatingrule": {
      "rulematchdetails": "null",
      "action": "BLOCK",
      "ruleid": "NoUserAgent_HEADER"
    },
    "excludedrules":"null"
  }
]

The piece of data I would like separated into its own column is rulegrouplist[terminatingrule].ruleid, which here has the value NoUserAgent_HEADER.

AWS provides useful information on querying nested Athena arrays, but I have been unable to get the result I want.

I have framed this as an AWS question, but since Athena uses SQL queries, it's likely that anyone with good SQL skills could work this out.

Answer

It's not entirely clear to me what you want, but I'm going to assume you are after the array element where terminatingrule is not "null" (I will also assume that if there are several, you want the first).

The documentation you link to says that the type of the rulegrouplist column is array<string>. The reason each element is a string and not a complex type is that there seem to be multiple different schemas for this column. One example is that the terminatingrule property is either the string "null" or a struct/object, which is something that can't be described using Athena's type system.

This is not a problem, however. When dealing with JSON there's a whole set of JSON functions that can be used. Here's one way to use json_extract combined with filter and element_at to remove the array elements where the terminatingrule property is the string "null", and then pick the first of the remaining elements:

SELECT
  element_at(
    filter(
      rulegrouplist,
      rulegroup -> json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON)
    ),
    1
  ) AS first_non_null_terminatingrule
FROM waf_logs
WHERE action = 'BLOCK'
-- This query defines no "date" alias, so order by the raw timestamp column instead
ORDER BY timestamp DESC
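
To see how the predicate in that filter behaves, here is a small self-contained sketch that applies it to two hand-written strings shaped like the elements in the question. The literals are illustrative, not real log data. Casting a varchar to JSON yields a JSON string, so CAST('null' AS JSON) is the string "null" rather than a JSON null, which is why it matches the elements that have no terminating rule. If it behaves as I expect, has_terminatingrule is false for the first row and true for the second.

```sql
-- Illustrative only: two made-up elements mimicking entries of rulegrouplist.
SELECT
  json_extract_scalar(rulegroup, '$.rulegroupid') AS rulegroupid,
  -- json_extract parses the varchar and returns a value of type JSON
  json_extract(rulegroup, '$.terminatingrule') AS terminatingrule,
  -- the same comparison used inside filter() in the query above
  json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON) AS has_terminatingrule
FROM UNNEST(ARRAY[
  '{"rulegroupid":"AWS#AWSManagedRulesLinuxRuleSet","terminatingrule":"null"}',
  '{"rulegroupid":"AWS#AWSManagedRulesCommonRuleSet","terminatingrule":{"action":"BLOCK","ruleid":"NoUserAgent_HEADER"}}'
]) AS t(rulegroup);
```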

You say you want the "latest", which to me is ambiguous: it could mean either the first or the last non-null element. The query above returns the first non-null element. If you want the last, change the second argument of element_at to -1 (Athena's array indexing starts from 1, and -1 counts from the end).
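
As a sketch of that variant, the query below is the same as the one above except for the index passed to element_at. It orders by the raw timestamp column, which is an assumption on my part, since this query does not define a date alias.

```sql
SELECT
  element_at(
    filter(
      rulegrouplist,
      rulegroup -> json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON)
    ),
    -1  -- negative index: count from the end of the filtered array
  ) AS last_non_null_terminatingrule
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY timestamp DESC;
```

As far as I know, element_at returns NULL rather than raising an error when the filtered array is empty, so rows without any terminating rule in rulegrouplist should simply show NULL.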

To return the individual ruleid element of the JSON:

SELECT
  from_unixtime(timestamp / 1000e0) AS date,
  action,
  httprequest.clientip AS ip,
  httprequest.uri AS request,
  httprequest.country AS country,
  terminatingruleid,
  json_extract(
    element_at(
      filter(
        rulegrouplist,
        rulegroup -> json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON)
      ),
      1
    ),
    '$.terminatingrule.ruleid'
  ) AS ruleid
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY date DESC
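
One detail worth noting: json_extract returns a value of type JSON, so the ruleid comes back as a JSON string. If you would rather have a plain varchar column, json_extract_scalar can be used for the outer call instead. The following is a minimal sketch of that variation; the trimmed column list and the LIMIT (borrowed from the question) are my own choices, not part of the query above.

```sql
SELECT
  from_unixtime(timestamp / 1000e0) AS date,
  terminatingruleid,
  -- json_extract_scalar returns varchar instead of JSON
  json_extract_scalar(
    element_at(
      filter(
        rulegrouplist,
        rulegroup -> json_extract(rulegroup, '$.terminatingrule') <> CAST('null' AS JSON)
      ),
      1
    ),
    '$.terminatingrule.ruleid'
  ) AS ruleid
FROM waf_logs
WHERE action = 'BLOCK'
ORDER BY date DESC
LIMIT 100;
```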
