Elasticsearch distinct filter values

Views: 123 | Published: 2017/8/7 3:40:08 | Tags: filter, elasticsearch, nosql, distinct


Problem description

I have a large document store in Elasticsearch and would like to retrieve the distinct filter values to display in HTML drop-downs.

An example would be something like:
[
    {
        "name": "John Doe",
        "departments": [
            {
                "name": "Accounts"
            },
            {
                "name": "Management"
            }
        ]
    },
    {
        "name": "Jane Smith",
        "departments": [
            {
                "name": "IT"
            },
            {
                "name": "Management"
            }
        ]
    }
]
The drop-down should have a list of departments, i.e. IT, Accounts and Management.

Would some kind person please point me in the right direction for retrieving a distinct list of departments from elasticsearch?

Thanks
Solution
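This is a job for a terms aggregation (documentation: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation).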

You can get the distinct department values like this:
POST company/employee/_search
{
  "size":0,
  "aggs": {
    "by_departments": {
      "terms": {
        "field": "departments.name",
        "size": 0 //see note 1
      }
    }
  }
}
Which, in your example, outputs:
{
   ...
   "aggregations": {
      "by_departments": {
         "buckets": [
            {
               "key": "management", //see note 2
               "doc_count": 2
            },
            {
               "key": "accounts",
               "doc_count": 1
            },
            {
               "key": "it",
               "doc_count": 1
            }
         ]
      }
   }
}
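To fill the drop-down, only the bucket keys are needed. The following is a minimal Python sketch of that step, assuming the response has already been parsed into a dict; the helper name distinct_values and the sample data are illustrative and not part of any Elasticsearch client.

```python
def distinct_values(response, agg_name):
    """Return the bucket keys of a terms aggregation, sorted for display."""
    buckets = (
        response.get("aggregations", {})
        .get(agg_name, {})
        .get("buckets", [])
    )
    return sorted(bucket["key"] for bucket in buckets)


# Parsed response shaped like the output above (only the relevant part).
sample_response = {
    "aggregations": {
        "by_departments": {
            "buckets": [
                {"key": "management", "doc_count": 2},
                {"key": "accounts", "doc_count": 1},
                {"key": "it", "doc_count": 1},
            ]
        }
    }
}

options = distinct_values(sample_response, "by_departments")
print(options)  # ['accounts', 'it', 'management']
```

Sorting is a presentation choice for the drop-down; the aggregation itself returns buckets ordered by doc_count.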
Two additional notes:

1. Setting size to 0 sets the maximum number of buckets to Integer.MAX_VALUE. Don't use it if there are too many distinct department values.
2. As you can see, the keys are the terms produced by analyzing the department values. Be sure to run your terms aggregation on a field mapped as not_analyzed.
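If the number of distinct values is large or unknown, one alternative to size 0 is an explicit cap. The sketch below builds the same request body in Python with a bounded bucket count; the function name build_distinct_request and the cap of 500 are assumptions chosen for illustration, not recommended values.

```python
import json


def build_distinct_request(field, max_buckets=100):
    """Build a terms-aggregation request body with an explicit bucket cap."""
    if max_buckets <= 0:
        raise ValueError("max_buckets must be a positive integer")
    return {
        "size": 0,  # return no hits, only the aggregation result
        "aggs": {
            "by_departments": {
                "terms": {"field": field, "size": max_buckets}
            }
        },
    }


body = build_distinct_request("departments.name", max_buckets=500)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same _search request shown earlier.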


For example, with our default mapping (departments.name is an analyzed string), adding this employee:
{
  "name": "Bill Gates",
  "departments": [
    {
      "name": "IT"
    },
    {
      "name": "Human Resource"
    }
  ]
}
will cause this kind of result:
{
   ...
   "aggregations": {
      "by_departments": {
         "buckets": [
            {
               "key": "it",
               "doc_count": 2
            },
            {
               "key": "management",
               "doc_count": 2
            },
            {
               "key": "accounts",
               "doc_count": 1
            },
            {
               "key": "human",
               "doc_count": 1
            },
            {
               "key": "resource",
               "doc_count": 1
            }
         ]
      }
   }
}
With a correct mapping:
POST company
{
  "mappings": {
    "employee": {
      "properties": {
        "name": {
          "type": "string"
        },
        "departments": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}
The same request then outputs:
{
   ...
   "aggregations": {
      "by_departments": {
         "buckets": [
            {
               "key": "IT",
               "doc_count": 2
            },
            {
               "key": "Management",
               "doc_count": 2
            },
            {
               "key": "Accounts",
               "doc_count": 1
            },
            {
               "key": "Human Resource",
               "doc_count": 1
            }
         ]
      }
   }
}
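The difference between the two results can be reproduced without a cluster. The Python sketch below is a simplified simulation, not Elasticsearch itself: simple_analyze only lowercases and splits on whitespace, which is a rough stand-in for the default analyzer, and terms_counts is a hypothetical helper that counts each document once per term, as doc_count does.

```python
from collections import Counter


def simple_analyze(text):
    # Rough stand-in for the default analyzer: lowercase, split on whitespace.
    return text.lower().split()


def terms_counts(docs, analyzed):
    """Count, per term, how many documents contain it (like doc_count)."""
    counts = Counter()
    for doc in docs:
        terms = set()  # a document counts at most once per term
        for dept in doc["departments"]:
            if analyzed:
                terms.update(simple_analyze(dept["name"]))
            else:
                terms.add(dept["name"])  # not_analyzed: keep the exact value
        counts.update(terms)
    return dict(counts)


employees = [
    {"name": "John Doe",
     "departments": [{"name": "Accounts"}, {"name": "Management"}]},
    {"name": "Jane Smith",
     "departments": [{"name": "IT"}, {"name": "Management"}]},
    {"name": "Bill Gates",
     "departments": [{"name": "IT"}, {"name": "Human Resource"}]},
]

print(terms_counts(employees, analyzed=True))
print(terms_counts(employees, analyzed=False))
```

With analyzed=True, "Human Resource" is split into the two terms human and resource, as in the first result; with analyzed=False, the original values survive intact, as in the second.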
Hope this helps!
