Elasticsearch: Influence scoring with custom score field in document
Problem description
I have a set of words extracted from text by NLP algorithms, with an associated score for each word in every document.
For example:
document 1:
{
  "vocab": [
    {"wtag": "James Bond", "rscore": 2.14},
    {"wtag": "world", "rscore": 0.86},
    ....,
    {"wtag": "somemore", "rscore": 3.15}
  ]
}
document 2:
{
  "vocab": [
    {"wtag": "hiii", "rscore": 1.34},
    {"wtag": "world", "rscore": 0.94},
    ....,
    {"wtag": "somemore", "rscore": 3.23}
  ]
}
I want the rscore values of the matched wtag entries in each document to affect the _score given to it by ES, maybe multiplied by or added to the _score, to influence the final _score (and, in turn, the order) of the resulting documents. Is there any way to achieve this?
Solution
Another way of approaching this would be to use nested documents:
First set up the mapping to make vocab a nested document, meaning that each wtag/rscore document would be indexed internally as a separate document:
curl -XPUT "http://localhost:9200/myindex/" -d'
{
  "settings": {"number_of_shards": 1},
  "mappings": {
    "mytype": {
      "properties": {
        "vocab": {
          "type": "nested",
          "properties": {
            "wtag": {
              "type": "string"
            },
            "rscore": {
              "type": "float"
            }
          }
        }
      }
    }
  }
}'
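To see why the nested mapping matters here, the following is a rough Python illustration (not Elasticsearch code) of how an array of objects is stored with and without it. Without a nested mapping, Elasticsearch flattens object arrays into one list of values per key, so the link between a wtag and its own rscore is lost. The helper names flatten_object_array and as_nested_docs are made up for this sketch.

```python
# Illustration only: mimics how an array of objects is stored
# with and without a nested mapping. Not actual Elasticsearch code.

def flatten_object_array(field, items):
    """Default (non-nested) object arrays: values are merged per key."""
    flat = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def as_nested_docs(field, items):
    """Nested mapping: each object is kept as its own hidden document."""
    return [
        {f"{field}.{key}": value for key, value in item.items()}
        for item in items
    ]


vocab = [
    {"wtag": "James Bond", "rscore": 2.14},
    {"wtag": "world", "rscore": 0.86},
    {"wtag": "somemore", "rscore": 3.15},
]

flat = flatten_object_array("vocab", vocab)
nested = as_nested_docs("vocab", vocab)

print(flat)    # one list of wtags and one list of rscores, pairing lost
print(nested)  # one small document per wtag/rscore pair
```

In the flattened form a script cannot tell which rscore belongs to the wtag that matched, which is why the per-word score needs the nested form.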
Then index your docs:
curl -XPUT "http://localhost:9200/myindex/mytype/1" -d'
{
  "vocab": [
    {
      "wtag": "James Bond",
      "rscore": 2.14
    },
    {
      "wtag": "world",
      "rscore": 0.86
    },
    {
      "wtag": "somemore",
      "rscore": 3.15
    }
  ]
}'
curl -XPUT "http://localhost:9200/myindex/mytype/2" -d'
{
  "vocab": [
    {
      "wtag": "hiii",
      "rscore": 1.34
    },
    {
      "wtag": "world",
      "rscore": 0.94
    },
    {
      "wtag": "somemore",
      "rscore": 3.23
    }
  ]
}'
And run a nested query to match all the nested documents and add up the values of rscore for each nested document which matches:
curl -XGET "http://localhost:9200/myindex/mytype/_search" -d'
{
  "query": {
    "nested": {
      "path": "vocab",
      "score_mode": "sum",
      "query": {
        "function_score": {
          "query": {
            "match": {
              "vocab.wtag": "james bond world"
            }
          },
          "script_score": {
            "script": "doc[\"vocab.rscore\"].value"
          }
        }
      }
    }
  }
}'
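As a rough sanity check of what this query computes, here is a small Python sketch of the arithmetic. It assumes function_score's default boost_mode of "multiply" (each nested document's match score is multiplied by its rscore) and that score_mode "sum" then adds those per-nested-document scores. The match scores 1.0 and 0.5 are made-up placeholders, not values Elasticsearch would return, and nested_sum_score is a hypothetical helper name.

```python
# Illustration only: hypothetical match scores, not values returned by Elasticsearch.

def nested_sum_score(matches, boost_mode="multiply"):
    """Combine per-nested-document scores the way score_mode "sum" would.

    matches: list of (match_score, rscore) pairs, one per matching nested doc.
    boost_mode: how function_score combines the match score with the script value.
    """
    combine = {
        "multiply": lambda q, f: q * f,  # function_score default
        "sum": lambda q, f: q + f,       # add rscore to the match score
        "replace": lambda q, f: f,       # ignore the match score entirely
    }[boost_mode]
    return sum(combine(q, f) for q, f in matches)


# Document 1 matches "James Bond" (rscore 2.14) and "world" (rscore 0.86).
doc1 = [(1.0, 2.14), (0.5, 0.86)]
# Document 2 matches only "world" (rscore 0.94).
doc2 = [(0.5, 0.94)]

for mode in ("multiply", "sum", "replace"):
    print(mode, nested_sum_score(doc1, mode), nested_sum_score(doc2, mode))
```

If the rscore should be added to the match score instead of multiplied, or should replace it outright, setting "boost_mode" to "sum" or "replace" inside the function_score block should give the other two behaviours sketched above.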