Retrieving information from Elasticsearch, by the order of the input array

Problem description

I can't seem to find an answer to my question, so I decided to post it and see if someone can help me.

In my application, I have an array of ids that comes from the backend, already ordered the way I want, for example: [0] => 23, [1] => 12, [2] => 45, [3] => 21

I then ask Elasticsearch for the information corresponding to each id in this array, using a terms filter. The problem is that the results don't come back in the order of the ids I sent, so they get mixed up, like: [0] => 21, [1] => 45, [2] => 23, [3] => 12

Note that I can't reproduce in Elasticsearch the sort that orders the array in the backend.

I also can't order them in PHP, because I'm retrieving paginated results from Elasticsearch. If each page had 2 results, Elasticsearch could give me the info only for [0] => 21, [1] => 45 on the first page, so sorting that page in PHP would not help.

How can I get the results ordered by the input array? Any ideas?

Thanks in advance

Solution

Here is one way you can do it, with custom scripted scoring.

First I created some dummy data:

curl -XPUT "http://localhost:9200/test_index"

curl -XPOST "http://localhost:9200/test_index/_bulk" -d'
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 1 } }
{ "name" : "Document 1", "id" : 1 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 2 } }
{ "name" : "Document 2", "id" : 2 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 3 } }
{ "name" : "Document 3", "id" : 3 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 4 } }
{ "name" : "Document 4", "id" : 4 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 5 } }
{ "name" : "Document 5", "id" : 5 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 6 } }
{ "name" : "Document 6", "id" : 6 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 7 } }
{ "name" : "Document 7", "id" : 7 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 8 } }
{ "name" : "Document 8", "id" : 8 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 9 } }
{ "name" : "Document 9", "id" : 9 }
{ "index" : { "_index" : "test_index", "_type" : "docs", "_id" : 10 } }
{ "name" : "Document 10", "id" : 10 }
'
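The bulk payload above follows a fixed pattern: one action line, then one source line, per document. As a rough sketch, it could be generated instead of typed by hand. The helper name `build_bulk_body` is made up for illustration and is not part of any Elasticsearch client; the function only builds the request text and does not send it.

```python
import json


def build_bulk_body(index, doc_type, count):
    """Return a newline-delimited bulk payload for documents 1..count."""
    lines = []
    for i in range(1, count + 1):
        # Action line: tells the bulk API where the next line should be indexed.
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type, "_id": i}}))
        # Source line: the document itself, including the redundant integer "id" field.
        lines.append(json.dumps({"name": f"Document {i}", "id": i}))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_body("test_index", "docs", 10), end="")
```

The printed text is what would go between the quotes of the `-d` argument in the curl command.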

I used an "id" field even though it's redundant, since the "_id" field gets converted to a string, and the scripting is easier with integers.

You can get back a specific set of docs by id with the ids filter:

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
   "filter": {
      "ids": {
         "type": "docs",
         "values": [ 1, 8, 2, 5 ]
      }
   }
}'

but these will not necessarily be in the order you want them. Using script-based scoring, you can define your own ordering based on document ids.

Here I pass in a parameter that is a list of objects that relate ids to score. The scoring script simply loops through them until it finds the current document id and returns the predetermined score for that document (or 0 if it isn't listed).

curl -XPOST "http://localhost:9200/test_index/_search" -d'
{
   "filter": {
      "ids": {
         "type": "docs",
         "values": [ 1, 8, 2, 5 ]
      }
   },
   "sort" : {
        "_script" : {
            "script" : "for(i:scoring) { if(doc[\"id\"].value == i.id) return i.score; } return 0;",
            "type" : "number",
            "params" : {
                "scoring" : [
                    { "id": 1, "score": 1 },
                    { "id": 8, "score": 2 },
                    { "id": 2, "score": 3 },
                    { "id": 5, "score": 4 }
                ]
            },
            "order" : "asc"
        }
    }
}'

and the documents are returned in the proper order:

{
   "took": 11,
   "timed_out": false,
   "_shards": {
      "total": 2,
      "successful": 2,
      "failed": 0
   },
   "hits": {
      "total": 4,
      "max_score": null,
      "hits": [
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "1",
            "_score": null,
            "_source": {
               "name": "Document 1",
               "id": 1
            },
            "sort": [
               1
            ]
         },
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "8",
            "_score": null,
            "_source": {
               "name": "Document 8",
               "id": 8
            },
            "sort": [
               2
            ]
         },
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "2",
            "_score": null,
            "_source": {
               "name": "Document 2",
               "id": 2
            },
            "sort": [
               3
            ]
         },
         {
            "_index": "test_index",
            "_type": "docs",
            "_id": "5",
            "_score": null,
            "_source": {
               "name": "Document 5",
               "id": 5
            },
            "sort": [
               4
            ]
         }
      ]
   }
}
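In an application, the `scoring` parameter would normally be derived from the ordered id array rather than written by hand. The sketch below shows one way to do that in Python; `build_ordered_search` is a hypothetical helper, and the request shape simply mirrors the query shown above, so other Elasticsearch versions may need a different body. The `from` and `size` keys are the standard pagination parameters, added here because the question involves paginated results.

```python
import json

# Same script as in the query above, kept as one string.
SORT_SCRIPT = 'for(i:scoring) { if(doc["id"].value == i.id) return i.score; } return 0;'


def build_ordered_search(ids, doc_type="docs", page=0, size=None):
    """Build a search body that returns `ids` in the order they are given."""
    # The position in the input array becomes the sort value (1-based, as above).
    scoring = [{"id": doc_id, "score": rank} for rank, doc_id in enumerate(ids, start=1)]
    body = {
        "filter": {"ids": {"type": doc_type, "values": list(ids)}},
        "sort": {
            "_script": {
                "script": SORT_SCRIPT,
                "type": "number",
                "params": {"scoring": scoring},
                "order": "asc",
            }
        },
    }
    if size is not None:
        # Paging is applied to the already sorted hits.
        body["from"] = page * size
        body["size"] = size
    return body


if __name__ == "__main__":
    print(json.dumps(build_ordered_search([1, 8, 2, 5], page=0, size=2), indent=2))
```

Because the whole ordered array is sent with every request, each page is cut from the same globally sorted list, which is what makes the ordering survive pagination.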

Here is a runnable example: http://sense.qbox.io/gist/01b28e5c038c785f0844abb7c01a71d69a32a2f4
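To see why this approach works with pagination, the sort script can be imitated in plain Python using the ids from the question. This is only an emulation of the logic, not code that runs inside Elasticsearch, and the function names are invented for the example.

```python
def script_sort_value(doc_id, scoring):
    """Mirror of the sort script: the first matching entry's score, else 0."""
    for entry in scoring:
        if doc_id == entry["id"]:
            return entry["score"]
    return 0


def emulate_sorted_page(stored_ids, ordered_ids, page, size):
    """Sort the stored ids by their script value, then cut out one page."""
    scoring = [{"id": d, "score": r} for r, d in enumerate(ordered_ids, start=1)]
    ranked = sorted(stored_ids, key=lambda d: script_sort_value(d, scoring))
    return ranked[page * size:(page + 1) * size]


if __name__ == "__main__":
    stored = [21, 45, 23, 12]   # the mixed-up order from the question
    wanted = [23, 12, 45, 21]   # the order supplied by the backend
    print(emulate_sorted_page(stored, wanted, page=0, size=2))  # [23, 12]
    print(emulate_sorted_page(stored, wanted, page=1, size=2))  # [45, 21]
```

Two things are worth noticing. A document that is not in the list gets the value 0, so with ascending order it would sort ahead of the listed ones; the ids filter keeps such documents out of the result. The script also scans the list once per matching document, so very long id arrays could make the sort slow.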
