How can I get the total count of each word in an Elasticsearch document?
Problem description
I searched for this question but couldn't find any useful answer. I want to get the total count of each word in a document. For example, I have some tweets in my indices, and one of them says something like "It is so boring here I want to go to my home sweet home". The query should return a response like this:
It:1
is:1
so:1
boring:1
here:1
I:1
want:1
to:2
go:1
my:1
home:2
sweet:1
Is it possible to do that?
Recommended answer
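You are looking for term vectors, which make use of analyzers. As in the example here, you can define whatever analyzer you need, e.g. a stemmer that converts words to their root/normal form.
Have a look at the documentation for further information.
In: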
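# Close the index so that its analysis settings can be changed
POST so/_close

# Define a custom analyzer: standard tokenizer, lowercasing, English stemming
PUT so/_settings
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase", "my_stemmer"]
        }
      },
      "filter": {
        "my_stemmer": {
          "type": "stemmer",
          "name": "english"
        }
      }
    }
  }
}

# Reopen the index
POST so/_open

# Analyze the tweet field with the custom analyzer at index time
PUT so/t1/_mapping
{
  "t1": {
    "properties": {
      "tweet": {
        "type": "string",
        "store": true,
        "index_analyzer": "my_analyzer"
      }
    }
  }
}

# Index a sample tweet
POST so/t1/1
{"tweet": "It is so boring here I want to go to my home sweet home. So I'm bored"}

# Retrieve the term vectors of the document; this request is not shown in the
# listing, so the 1.x-style _termvector endpoint is an assumption
GET so/t1/1/_termvector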
Out:
{
  "_index": "so",
  "_type": "t1",
  "_id": "1",
  "_version": 2,
  "found": true,
  "term_vectors": {
    "tweet": {
      "field_statistics": {
        "sum_doc_freq": 13,
        "doc_count": 1,
        "sum_ttf": 17
      },
      "terms": {
        "bore": {
          "term_freq": 2,
          ...
        },
        "go": {
          "term_freq": 1,
          ...
        },
        "here": {
          "term_freq": 1,
          ...
        },
        "home": {
          "term_freq": 2,
          ...
        },
        "i": {
          "term_freq": 1,
          ...
        },
        "i'm": {
          "term_freq": 1,
          ...
        },
        "is": {
          "term_freq": 1,
          ...
        },
        "it": {
          "term_freq": 1,
          ...
        },
        "my": {
          "term_freq": 1,
          ...
        },
        "so": {
          "term_freq": 2,
          ...
        },
        "sweet": {
          "term_freq": 1,
          ...
        },
        "to": {
          "term_freq": 2,
          ...
        },
        "want": {
          "term_freq": 1,
          ...
        }
      }
    }
  }
}
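To get from this response to the word:count list asked for in the question, only the term_freq value of each term is needed. The following is a minimal Python sketch, assuming the response has already been fetched and parsed into a dict. The names term_counts and sample_response are hypothetical, and the sample is a trimmed copy of the output above rather than a full response. Note that the keys are the analyzed terms, so with the stemming analyzer "boring" and "bored" are both counted under "bore".

```python
from typing import Any, Dict


def term_counts(response: Dict[str, Any], field: str) -> Dict[str, int]:
    """Map each analyzed term of `field` to its term_freq in the document.

    Returns an empty dict if the response holds no term vectors for `field`.
    """
    terms = response.get("term_vectors", {}).get(field, {}).get("terms", {})
    return {term: info["term_freq"] for term, info in terms.items()}


# Hypothetical sample: a trimmed copy of the response above (three terms kept).
sample_response = {
    "_index": "so",
    "_type": "t1",
    "_id": "1",
    "found": True,
    "term_vectors": {
        "tweet": {
            "terms": {
                "bore": {"term_freq": 2},
                "home": {"term_freq": 2},
                "sweet": {"term_freq": 1},
            }
        }
    },
}

counts = term_counts(sample_response, "tweet")

# Print one "term:count" line per term, in alphabetical order.
for term, freq in sorted(counts.items()):
    print(f"{term}:{freq}")
```

For the trimmed sample this prints bore:2, home:2 and sweet:1, one per line.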