elasticsearch - Return the tokens of a field
Question
How can I have the tokens of a particular field returned in the result?
For example, a GET request
curl -XGET 'http://localhost:9200/twitter/tweet/1'
returns
{
"_index" : "twitter",
"_type" : "tweet",
"_id" : "1",
"_source" : {
"user" : "kimchy",
"postDate" : "2009-11-15T14:12:12",
"message" : "trying out Elastic Search"
}
}
I would like to have the tokens of the '_source.message' field included in the result.
Recommended answer
There is also another way to do it, using the following script_fields script:
curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/test-idx/_search?pretty=true' -d '{
"query" : {
"match_all" : { }
},
"script_fields": {
"terms" : {
"script": "doc[field].values",
"params": {
"field": "message"
}
}
}
}'
It's important to note that while this script returns the actual terms that were indexed, it also caches all field values, which can use a lot of memory on large indices. On large indices, it may therefore be more useful to retrieve the field values from stored fields or from the source and re-analyze them on the fly using the following MVEL script:
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import java.io.StringReader;
// Cache analyzer for further use
cachedAnalyzer=(isdef cachedAnalyzer)?cachedAnalyzer:doc.mapperService().documentMapper(doc._type.value).mappers().indexAnalyzer();
terms=[];
// Get value from Fields Lookup
//val=_fields[field].values;
// Get value from Source Lookup
val=_source[field];
if(val != null) {
tokenStream=cachedAnalyzer.tokenStream(field, new StringReader(val));
CharTermAttribute termAttribute = tokenStream.addAttribute(CharTermAttribute);
while(tokenStream.incrementToken()) {
terms.add(termAttribute.toString())
};
tokenStream.close();
}
terms
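To make the re-analysis loop in the script above concrete, here is a minimal Python sketch of the same idea: read a field value from the source, run it through an analyzer, and collect the resulting terms. The `analyze` function is a simplified stand-in (lowercase, split on non-word characters) and is an assumption for illustration only; it is not the Lucene analyzer that the mapping would actually apply, so real tokens may differ.

```python
import re


def analyze(text):
    """Simplified stand-in for an index analyzer (assumption, not Lucene):
    lowercase the text and split it on non-word characters."""
    if text is None:
        return []
    return re.findall(r"\w+", text.lower())


def terms_for_field(source, field):
    """Mirror the script's flow: look the field up in the source,
    then re-analyze its value into a list of terms."""
    return analyze(source.get(field))


# The document from the question, used as the source lookup.
source = {
    "user": "kimchy",
    "postDate": "2009-11-15T14:12:12",
    "message": "trying out Elastic Search",
}

terms = terms_for_field(source, "message")
print(terms)
```

As in the MVEL script, a missing field simply yields an empty list of terms rather than an error.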
The MVEL script above can be stored as config/scripts/analyze.mvel and used with the following query:
curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/test-idx/_search?pretty=true' -d '{
"query" : {
"match_all" : { }
},
"script_fields": {
"terms" : {
"script": "analyze",
"params": {
"field": "message"
}
}
}
}'
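For readers who would rather build the request body in code than by hand, the following Python sketch constructs the same script_fields body and pulls the terms out of a response. The helper names (`build_terms_query`, `extract_terms`) and the sample response are hypothetical; the sketch assumes that script field results come back under each hit's `fields` entry, which should be checked against the Elasticsearch version in use.

```python
import json


def build_terms_query(field, script="analyze"):
    """Build the request body shown above for a given field name.
    `script` defaults to the stored script name used in the query."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            "terms": {"script": script, "params": {"field": field}}
        },
    }


def extract_terms(response):
    """Map each hit's _id to the value of its 'terms' script field.
    Assumes script fields are returned under hit['fields']."""
    hits = response.get("hits", {}).get("hits", [])
    return {hit["_id"]: hit.get("fields", {}).get("terms", []) for hit in hits}


# Hypothetical response, shaped like a search result with one hit.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "test-idx",
                "_id": "1",
                "fields": {"terms": ["trying", "out", "elastic", "search"]},
            }
        ]
    }
}

body = build_terms_query("message")
print(json.dumps(body, indent=2))
print(extract_terms(sample_response))
```

The printed body can be passed to curl with -d in place of the inline JSON.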