Elasticsearch - Show index-wide count for each returned result based on a given term


Problem description


First, I apologise if my terminology is incorrect; I am learning Elasticsearch day by day and may be using the wrong phrases.

After spending several days trying to figure this out and pulling my hair out, I seem to hit a brick wall every time.

I am trying to get Elasticsearch to provide a document count for each returned result. An example query is below.


{
  "suggest": {
    "text": "aberdeen",
    "city": {
      "completion": {
        "field": "city_suggest",
        "size": "2"
      }
    },
    "street": {
      "completion": {
        "field": "street_suggest",
        "size": "2"
      }
    }
  },
  "size": 0,
  "aggs": {
    "meta": {
      "filter": {
        "term": {
          "city.raw": "aberdeen"
        }
      },
      "aggs": {
        "name": {
          "terms": {
            "field": "city.raw"
          }
        }
      }
    }
  }
}


The above query returns the following results:

{
  "took": 37,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1870535,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "meta": {
      "doc_count": 119196,
      "name": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "Aberdeen",
            "doc_count": 119196
          }
        ]
      }
    }
  },
  "suggest": {
    "city": [
      {
        "text": "Aberdeen",
        "offset": 0,
        "length": 8,
        "options": [
          {
            "text": "Aberdeen",
            "score": 100
          }
        ]
      }
    ],
    "street": [
      {
        "text": "Aberdeen",
        "offset": 0,
        "length": 8,
        "options": [
          {
            "text": "Davidson House, Aberdeen, AB15",
            "score": 80
          },
          {
            "text": "Bruce House, Aberdeen, AB15",
            "score": 80
          }
        ]
      }
    ]
  }
}


The result I am trying to achieve is an overall document count for each returned result. For example, the returned street address "Davidson House, Aberdeen, AB15" would show how many documents in the index match that address. This would be repeated for every suggestion, including the city, in the same way the city aggregation currently shows the overall count:

  {
    "key": "Aberdeen",
    "doc_count": 119196
  }

Here is an example of something similar in production


The problem I believe I face with aggregations is that I do not know in advance which values will be returned. If I did, I could predefine them in the aggregation, as I did for the city, and request the overall count for each result that way.

To illustrate, this is how I pictured a working result:

"suggest": {
    "city": [
      {
        "text": "Aberdeen",
        "offset": 0,
        "length": 8,
        "options": [
          {
            "text": "Aberdeen",
            "score": 100,
            "total_addresses": 196152
          }
        ]
      }
    ],
    "street": [
      {
        "text": "Aberdeen",
        "offset": 0,
        "length": 8,
        "options": [
          {
            "text": "Davidson House, Aberdeen, AB15",
            "score": 80,
            "total_addresses": 158
          },
          {
            "text": "Bruce House, Aberdeen, AB15",
            "score": 80,
            "total_addresses": 30
          }
        ]
      }
    ]
  }

In terms of the Elasticsearch version, I have two dev servers running Elasticsearch 2.3 and 5.5 to see whether the newer version would make a difference. Unfortunately I came up short on both, so I have been using 2.3 in favour of 5.5.

Any help or advice would be greatly appreciated. Thanks, all.

Solution

You need to divide your query in two. First use the suggest API to gather suggestions, then run an aggregation on the suggestions it returned. The drawback of this solution is that you pair a very fast suggestion request (less than a millisecond, if you're lucky) with a longer-running aggregation. If that's OK for you, this might be a good approach.
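As a rough illustration of the two-step approach, the sketch below builds the second request from the `suggest` section of the first response and then merges the counts back in. It is only a sketch under assumptions: the helper names, the `COUNT_FIELDS` mapping, and the `street.raw` field are hypothetical (only `city.raw` appears in the question), and it uses a `filters` aggregation with one named `term` filter per suggestion. Sending the requests is left to whatever client you already use.

```python
import copy

# Hypothetical mapping from suggester name to the exact-value field to count on.
# "city.raw" comes from the question; "street.raw" is assumed for illustration.
COUNT_FIELDS = {"city": "city.raw", "street": "street.raw"}


def build_count_request(suggest_section):
    """Build an aggregation-only search body with one named filter per suggestion."""
    filters = {}
    for name, entries in suggest_section.items():
        field = COUNT_FIELDS[name]
        position = 0
        for entry in entries:
            for option in entry["options"]:
                # Key each filter by suggester name and position, e.g. "street_1".
                filters["%s_%d" % (name, position)] = {"term": {field: option["text"]}}
                position += 1
    return {"size": 0, "aggs": {"counts": {"filters": {"filters": filters}}}}


def merge_counts(suggest_section, count_buckets):
    """Return a copy of the suggestions with total_addresses added to each option."""
    merged = copy.deepcopy(suggest_section)
    for name, entries in merged.items():
        position = 0
        for entry in entries:
            for option in entry["options"]:
                bucket = count_buckets["%s_%d" % (name, position)]
                option["total_addresses"] = bucket["doc_count"]
                position += 1
    return merged
```

With named filters, the second response should expose the counts under `aggregations.counts.buckets`, keyed by the same names, so that object can be passed to `merge_counts` as `count_buckets`. If the first request returns no options at all, it is probably best to skip the second request rather than send an empty `filters` aggregation.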

Another idea might be to maintain a separate suggestion index with pre-aggregated data that already contains such a count. This index would be recreated regularly in the background.
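A minimal sketch of how such pre-aggregated documents could be prepared, assuming the counts come from the buckets of a `terms` aggregation like the one in the question. The function name is hypothetical, `total_addresses` is borrowed from the question's desired output, and how the stored count is surfaced alongside a suggestion depends on the Elasticsearch version, so treat the document shape as an assumption to adapt.

```python
def build_suggestion_docs(buckets, suggest_field):
    """Turn terms-aggregation buckets into documents for a suggestion index."""
    docs = []
    for bucket in buckets:
        docs.append({
            # Completion field input: the value users will be typing.
            suggest_field: {"input": [bucket["key"]]},
            # Pre-computed count stored next to the suggestion.
            "total_addresses": bucket["doc_count"],
        })
    return docs
```

For example, the bucket `{"key": "Aberdeen", "doc_count": 119196}` would become a document with `city_suggest` input `Aberdeen` and `total_addresses` of 119196. Note that a `terms` aggregation returns only the top buckets by default, so a full rebuild would need a larger `size` or some other way of walking through all values.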
