How to set up percolator to return when an aggregation value hits a certain threshold?


Problem description

Take the following aggregation query as an example:

{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "groupBy": {
      "terms": {
        "field": "CustomerName"
      },
      "aggs": {
        "points_sum": {
          "stats": {
            "field": "TransactionAmount"
          }
        }
      }
    }
  },
  "size": 0
}

I am interested in knowing when any CustomerName has an average TransactionAmount (stats.avg) that is above some threshold for all of that customer's purchases, as soon as I index a document that would put my average above that threshold. It seems like percolator is designed for matching documents to rules, more or less, but I can't find any good examples of using percolator to match rules that are based on aggregation results.

Is this possible? Is percolator the best solution here? Is there another/better solution? Thanks in advance

Solution

You can use the Watcher commercial product for that and define the following watch:

PUT _watcher/watch/transaction_alert
{
  "trigger": {
    "schedule": {
      "interval": "1m"
    }
  },
  "input": {
    "search": {
      "request": {
        "indices": "transactions",
        "types": "transaction",
        "body": {
          "query": {
            "match_all": {}
          },
          "size": 0,
          "aggs": {
            "groupBy": {
              "terms": {
                "field": "CustomerName"
              },
              "aggs": {
                "points_sum": {
                  "stats": {
                    "field": "TransactionAmount"
                  }
                }
              }
            }
          }
        }
      }
    }
  },
  "condition": {
    "script": {
      "inline": "return ctx.payload.aggregations.groupBy.buckets.any{ cust -> cust.points_sum.avg >= 200 }"
    }
  },
  "actions": {
    "send_email": { 
      "email": {
        "to": "<username>@<domainname>", 
        "subject": "Customer Notification - Transaction > 200",
        "body": "The attached customers have a transaction average above $200",
        "attachments" : {
           "data.yml" : {
              "data" : {
                 "format" : "yaml" 
              }
           }
        }
      }
    }
  }
}
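
To make the threshold check concrete, here is a minimal Python sketch of the same test applied to a search response outside Watcher. The helper name `customers_over_threshold` and the sample response are made up for illustration; only the aggregation names `groupBy` and `points_sum` and the 200 threshold come from the watch above.

```python
def customers_over_threshold(response, threshold=200):
    """Return the keys of the terms buckets whose stats average is >= threshold."""
    buckets = response["aggregations"]["groupBy"]["buckets"]
    matches = []
    for bucket in buckets:
        avg = bucket["points_sum"]["avg"]
        # Skip buckets without a usable average rather than comparing None.
        if avg is not None and avg >= threshold:
            matches.append(bucket["key"])
    return matches


# Hypothetical response, trimmed to the fields the check reads.
sample_response = {
    "aggregations": {
        "groupBy": {
            "buckets": [
                {"key": "alice", "doc_count": 3, "points_sum": {"avg": 250.0}},
                {"key": "bob", "doc_count": 5, "points_sum": {"avg": 120.5}},
            ]
        }
    }
}

print(customers_over_threshold(sample_response))
```

An alert should fire only when the returned list is non-empty, which is the yes/no decision the watch condition has to make.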

UPDATE

To sum up:

  • Watcher is a commercial product
  • ElastAlert doesn't support it (yet) and requires some effort to make it work

There's another much simpler and cheaper way to achieve this using Logstash. Even though the elasticsearch input plugin doesn't support aggregations, it is possible to use the http_poller input plugin in order to send an aggregation query to Elasticsearch at regular intervals. Then using a filter you can check if the desired threshold is attained or not, and finally, alert someone by email if that's the case using the email output plugin.

The configuration basically goes like this (note that the aggregation query above needs to be URL-encoded and sent to ES using the source=... parameter). Also note that I've modified your query to sort the buckets by points_sum.avg (desc):

input {
  http_poller {
    urls => {
      test1 => 'http://localhost:9200/your-index/_search?source=%7B%22query%22%3A%7B%22match_all%22%3A%7B%7D%7D%2C%22aggs%22%3A%7B%22groupBy%22%3A%7B%22terms%22%3A%7B%22field%22%3A%22CustomerName%22%2C%22order%22%3A%7B%22points_sum.avg%22%3A%22desc%22%7D%7D%2C%22aggs%22%3A%7B%22points_sum%22%3A%7B%22stats%22%3A%7B%22field%22%3A%22TransactionAmount%22%7D%7D%7D%7D%7D%2C%22size%22%3A0%7D'
   }
   # checking every 10 seconds
   interval => 10
   codec => "json"
  }
}
filter {
  split {
    field => "[aggregations][groupBy][buckets]" 
  }
}
output {
  if [aggregations][groupBy][buckets][points_sum][avg] > 200 {
    email {
      to => "<username>@<domainname>"
      subject => "Customer Notification - Transaction > 200"
      body => "The customer %{[aggregations][groupBy][buckets][key]} has a transaction average above $200"
    }
  }
}
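
The URL-encoded `source` value in the configuration above is tedious to write by hand. As a rough sketch, it can be generated from the query with Python's standard library; the host and the `your-index` name are the placeholders already used in the config, and the variable names are only illustrative.

```python
import json
from urllib.parse import quote, unquote

# The aggregation query, with buckets ordered by points_sum.avg (desc).
query = {
    "query": {"match_all": {}},
    "aggs": {
        "groupBy": {
            "terms": {"field": "CustomerName", "order": {"points_sum.avg": "desc"}},
            "aggs": {"points_sum": {"stats": {"field": "TransactionAmount"}}},
        }
    },
    "size": 0,
}

# Serialize without whitespace, then percent-encode every reserved character.
compact = json.dumps(query, separators=(",", ":"))
encoded = quote(compact, safe="")

# Placeholder host and index, as in the Logstash configuration.
url = "http://localhost:9200/your-index/_search?source=" + encoded
print(url)
```

Decoding `encoded` with `unquote` and parsing it with `json.loads` gives back the original query, which is a quick way to check the string before pasting it into the `urls` setting.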

Agreed, this is a very simplistic implementation, but it should work and you can build upon it to make it smarter. With Logstash and your imagination, the sky is the limit ;-)

UPDATE 2

Another node.js tool called elasticwatch could also be leveraged to do this.
