Elasticsearch count terms ignoring spaces

Question

Using ES 1.2.1.

My aggregation:

{
    "size": 0,
    "aggs": {
        "cities": {
            "terms": {
                "field": "city",
                "size": 300000
            }
        }
    }
}

The issue is that city names containing spaces are split at the space, and each word is aggregated as a separate term.

For instance, Los Angeles:

{
    "key": "Los",
    "doc_count": 2230
},
{
    "key": "Angeles",
    "doc_count": 2230
},

I assume it has to do with the analyzer? Which one would I use to not split on spaces?

Recommended answer

For fields that you want to aggregate on, I would recommend either using the keyword analyzer or not analyzing the field at all. From the keyword analyzer documentation:


An analyzer of type keyword that "tokenizes" an entire stream as a single token. This is useful for data like zip codes, ids and so on. Note, when using mapping definitions, it might make more sense to simply mark the field as not_analyzed.
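As a rough sketch of the not_analyzed option, an index-creation body for ES 1.x might look like the following. The type name `doc` is a placeholder (the question does not name its mapping type); `city` is the field from the question. With this mapping the whole value "Los Angeles" is indexed as a single term, so the terms aggregation should return one bucket for it.

```json
{
    "mappings": {
        "doc": {
            "properties": {
                "city": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}
```

The index setting of an existing field cannot be changed in place, so this would normally mean creating a new index with the mapping and reindexing the documents.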

However, if you still want the field to be analyzed so that it can be used in other searches, consider using the fields setting of ES 1.x, as described in the field/multi_field documentation. This lets you keep one version of the field for searching and another for aggregations.
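A minimal sketch of the multi-field approach, under the same assumptions as the question's `city` field: the type name `doc` and the sub-field name `raw` are illustrative choices, not names from the original post. The main `city` field stays analyzed for full-text search, while `city.raw` holds the unanalyzed value.

```json
{
    "mappings": {
        "doc": {
            "properties": {
                "city": {
                    "type": "string",
                    "fields": {
                        "raw": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            }
        }
    }
}
```

With a mapping like this, the aggregation from the question would set "field" to "city.raw" instead of "city", while ordinary queries continue to target "city". Documents indexed before the sub-field was added would need to be reindexed to appear in the new buckets.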
