Elasticsearch - Return the tokens of a field


Question

How can I have the tokens of a particular field returned in the result?

For example, a GET request

curl -XGET 'http://localhost:9200/twitter/tweet/1'

returns

{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1", 
    "_source" : {
        "user" : "kimchy",
        "postDate" : "2009-11-15T14:12:12",
        "message" : "trying out Elastic Search"
    } 
}

I would like to have the tokens of the '_source.message' field included in the result.

Recommended answer

There is also another way to do it, using the following script_fields script:

curl 'http://localhost:9200/test-idx/_search?pretty=true' -d '{
    "query" : {
        "match_all" : { }
    },
    "script_fields": {
        "terms" : {
            "script": "doc[field].values",
            "params": {
                "field": "message"
            }
        }

    }
}'

It's important to note that while this script returns the actual terms that were indexed, it also caches all field values, which on large indices can use a lot of memory. So, on large indices, it might be more useful to retrieve the field value from stored fields or from the source and re-analyze it on the fly using the following MVEL script:

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import java.io.StringReader;

// Cache analyzer for further use
cachedAnalyzer=(isdef cachedAnalyzer)?cachedAnalyzer:doc.mapperService().documentMapper(doc._type.value).mappers().indexAnalyzer();

terms=[];
// Get value from Fields Lookup
//val=_fields[field].values;

// Get value from Source Lookup
val=_source[field];

if(val != null) {
  tokenStream=cachedAnalyzer.tokenStream(field, new StringReader(val)); 
  CharTermAttribute termAttribute = tokenStream.addAttribute(CharTermAttribute); 
  while(tokenStream.incrementToken()) { 
    terms.add(termAttribute.toString())
  }; 
  tokenStream.close(); 
} 
terms
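
The loop in the script above follows the usual token-stream pattern: read the field value, open a stream over it, advance one token at a time, collect each term, and return the list. As a rough illustration only, here is the same pattern in Python. The names toy_token_stream and reanalyze are hypothetical, and the lowercase-and-split tokenizer is a stand-in assumption, not the analyzer Elasticsearch would apply from the field's mapping.

```python
import re


def toy_token_stream(text):
    """Yield lowercase word tokens one at a time (hypothetical stand-in analyzer)."""
    for match in re.finditer(r"\w+", text):
        yield match.group(0).lower()


def reanalyze(source, field):
    """Collect the tokens of source[field]; return an empty list if the field is missing."""
    terms = []
    value = source.get(field)
    if value is not None:
        # Mirrors the while(tokenStream.incrementToken()) loop in the MVEL script.
        for token in toy_token_stream(value):
            terms.append(token)
    return terms
```

The real script differs in one important way: it looks up the index analyzer configured for the document type, so the tokens it returns depend on the mapping rather than on a fixed rule like the one assumed here.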

The MVEL script above can be stored as config/scripts/analyze.mvel and used with the following query:

curl 'http://localhost:9200/test-idx/_search?pretty=true' -d '{
    "query" : {
        "match_all" : { }
    },
    "script_fields": {
        "terms" : {
            "script": "analyze",
            "params": {
                "field": "message"
            }
        }

    }
}'
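
Both requests in this answer share the same body shape and differ only in the "script" value. A minimal Python sketch of building that body programmatically might look like the following. The function names build_script_fields_body and search are hypothetical, the node address and index name are taken from the curl examples, and the Content-Type header is an assumption that the curl commands do not send.

```python
import json
import urllib.request


def build_script_fields_body(script, field, result_name="terms"):
    """Build a match_all search body with a single script field."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            result_name: {
                "script": script,
                "params": {"field": field},
            }
        },
    }


def search(base_url, index, body):
    """POST the body to <base_url>/<index>/_search and decode the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/{index}/_search?pretty=true",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumption, see lead-in
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Same body as the curl request that uses the stored "analyze" script.
    body = build_script_fields_body("analyze", "message")
    print(json.dumps(body, indent=4))
    # Against a live node (assumed at localhost:9200, index test-idx):
    # result = search("http://localhost:9200", "test-idx", body)
```

Passing "doc[field].values" instead of "analyze" as the script argument reproduces the body of the first request.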
