ElasticSearch + Kibana - Unique count using pre-computed hashes
Question

Update: added

I want to perform a unique count on my Elasticsearch cluster. The cluster contains about 50 million records.

I tried the following approaches:
The first method, as mentioned in this section:
Pre-computing hashes is usually only useful on very large and/or high-cardinality fields as it saves CPU and memory.
Second method
As mentioned in the section here:
Unless you configure Elasticsearch to use doc_values as the field data format, the use of aggregations and facets is very demanding on heap space.
My property mapping
"my_prop": {
  "index": "not_analyzed",
  "fielddata": {
    "format": "doc_values"
  },
  "doc_values": true,
  "type": "string",
  "fields": {
    "hash": {
      "type": "murmur3"
    }
  }
}
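For context, the "unique count" metric in Kibana is backed by a cardinality aggregation. A sketch of what such a request body looks like when pointed at the hash sub-field (the aggregation name `unique_my_prop` is illustrative, not something Kibana necessarily generates):

```json
{
  "size": 0,
  "aggs": {
    "unique_my_prop": {
      "cardinality": {
        "field": "my_prop.hash"
      }
    }
  }
}
```

Because `my_prop.hash` is of type `murmur3`, the cardinality aggregation can use the pre-computed hashes directly instead of hashing each string value at query time.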
The problem

When I use unique count on my_prop.hash in Kibana, I receive the following error:
Data too large, data for [my_prop.hash] would be larger than limit
Elasticsearch has a 2g heap size. The above also fails for a single index with 4 million records.
- Am I missing something in the configuration?
- Should I scale up the machine? That doesn't seem like a scalable solution.
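As background on the error itself: the "Data too large ... would be larger than limit" message comes from the fielddata circuit breaker, which by default trips when fielddata would exceed 60% of the heap. The threshold can be changed (shown below for illustration), though raising it only postpones the problem rather than solving it:

```yaml
# elasticsearch.yml - fielddata circuit breaker threshold
# (default is 60% of the JVM heap; 75% here is just an example value)
indices.breaker.fielddata.limit: 75%
```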
Elasticsearch query

Generated by Kibana:
http://pastebin.com/hf1yNLhE
Answer
That error says you don't have enough memory (more specifically, memory for fielddata) to store all the values of hash, so you need to take them out of the heap and put them on disk, which means using doc_values.
Since you are already using doc_values for my_prop, I suggest doing the same for my_prop.hash (and, no, the settings from the main field are not inherited by the sub-fields): "hash": { "type": "murmur3", "index": "no", "doc_values": true }.
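Putting that suggestion together, the full mapping for my_prop would then look like this (a sketch; note that changing the mapping of an existing field requires reindexing):

```json
"my_prop": {
  "type": "string",
  "index": "not_analyzed",
  "doc_values": true,
  "fielddata": {
    "format": "doc_values"
  },
  "fields": {
    "hash": {
      "type": "murmur3",
      "index": "no",
      "doc_values": true
    }
  }
}
```

With "doc_values": true on the sub-field, the pre-computed hashes are read from disk-backed doc values during the cardinality aggregation instead of being loaded into the heap as fielddata, which is what was tripping the circuit breaker.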