Sort results by term frequency count


Problem description

Suppose there are two documents that contain the word "world" 5 times and 2 times, respectively.

I want the document that contains "world" 5 times to be listed first, followed by the document that contains it 2 times.

How do I sort the results this way?

Thanks.

Solution

I don't think you need to sort explicitly. If you have documents like the ones you describe and you search for a word that occurs one, two, or more times in them, Elasticsearch computes a relevance score for each matching document automatically and returns the hits sorted by that score, highest first. Term frequency is one of the factors in the score, so a document that contains the word more often tends to rank higher.
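
If you prefer to make that ordering explicit in the request, a sort on `_score` can be added to the search body. The following is a minimal Python sketch that only builds and prints such a body; `build_search_body` is a hypothetical helper name, not part of any Elasticsearch client.

```python
import json


def build_search_body(term):
    """Build a search request body that sorts hits by relevance score, highest first.

    `build_search_body` is a hypothetical helper used only for illustration.
    """
    return {
        "query": {"query_string": {"query": term}},
        # Explicitly request the ordering that is already applied by default.
        "sort": [{"_score": {"order": "desc"}}],
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of a _search request.
    print(json.dumps(build_search_body("world"), indent=2))
```

The printed JSON can be passed to `curl` with `-d` in the same way as the query further down.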

To try this, index a few documents:

curl -XPUT "http://localhost:9200/movies/movie/1" -d'
{
  "title": "The Godfather",
  "director": "Francis Ford Coppola",
  "year": 1972,
  "genres": [
    "Crime",
    "Drama"
  ]
}'

curl -XPUT "http://localhost:9200/movies/movie/2" -d'
{
  "title": "The Godfather Godfather",
  "director": "Francis Ford Coppola",
  "year": 1972,
  "genres": [
    "Crime",
    "Drama"
  ]
}'

curl -XPUT "http://localhost:9200/movies/movie/3" -d'
{
  "title": "The Godfather Godfather Godfather",
  "director": "Francis Ford Coppola",
  "year": 1972,
  "genres": [
    "Crime",
    "Drama"
  ]
}'

After indexing, run this query and look at the result:

curl -XPOST "http://localhost:9200/movies/_search" -d'
{
  "explain": true,
  "query": {
    "query_string": {
      "query": "godfather"
    }
  }
}'

This returns document 3 at the top, because its title contains "godfather" more times than the others.
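
To see why the third document comes out first, here is a rough Python sketch of frequency-based ranking over the three titles above. The `toy_score` function is an illustrative assumption (square root of the term count, damped by the number of tokens in the field), and `tokenize` is only a crude stand-in for an analyzer. The score Elasticsearch actually computes involves further factors; the `explain` output in the response shows the real breakdown for each hit.

```python
import math
import re

# Titles of the three example documents, keyed by document id.
TITLES = {
    1: "The Godfather",
    2: "The Godfather Godfather",
    3: "The Godfather Godfather Godfather",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def term_frequency(text, term):
    """Count how many times `term` occurs among the tokens of `text`."""
    return tokenize(text).count(term.lower())


def toy_score(text, term):
    """Simplified score: sqrt(term frequency) divided by sqrt(field length).

    This is an assumption made for illustration, not Elasticsearch's formula.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    freq = tokens.count(term.lower())
    return math.sqrt(freq) / math.sqrt(len(tokens))


def rank(docs, term):
    """Return document ids ordered by descending toy score."""
    return sorted(docs, key=lambda doc_id: toy_score(docs[doc_id], term), reverse=True)


if __name__ == "__main__":
    for doc_id in rank(TITLES, "godfather"):
        title = TITLES[doc_id]
        print(doc_id, term_frequency(title, "godfather"), round(toy_score(title, "godfather"), 3))
```

Under this toy model the ids come back in the order 3, 2, 1, which matches the ordering described above.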
