elasticsearch - Return the tokens of a field

Question

How can I have the tokens of a particular field returned in the result?

For example, a GET request

curl -XGET 'http://localhost:9200/twitter/tweet/1'

returns

{
    "_index" : "twitter",
    "_type" : "tweet",
    "_id" : "1", 
    "_source" : {
        "user" : "kimchy",
        "postDate" : "2009-11-15T14:12:12",
        "message" : "trying out Elastic Search"
    } 
}

I would like to have the tokens of the '_source.message' field included in the result.

Recommended answer

Another way to do this is with the following script_fields script:

curl -H 'Content-Type: application/json' -XPOST 'http://localhost:9200/test-idx/_search?pretty=true' -d '{
    "query" : {
        "match_all" : { }
    },
    "script_fields": {
        "terms" : {
            "script": "doc[field].values",
            "params": {
                "field": "message"
            }
        }

    }
}'
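
For readers calling this from code rather than curl, here is a minimal Python sketch of the same request body, plus a helper that pulls the script field out of a search response. The function names and the sample response are hypothetical and exist only for illustration. The response is trimmed to the hits.hits[].fields.terms path, and the exact shape may differ between Elasticsearch versions.

```python
import json


def build_terms_query(field, script="doc[field].values"):
    """Build the same search body as the curl example above."""
    return {
        "query": {"match_all": {}},
        "script_fields": {
            "terms": {
                "script": script,
                "params": {"field": field},
            }
        },
    }


def extract_terms(response):
    """Map each hit's _id to the values of its 'terms' script field."""
    hits = response.get("hits", {}).get("hits", [])
    return {hit["_id"]: hit.get("fields", {}).get("terms", []) for hit in hits}


# Hypothetical response, trimmed to the parts that extract_terms reads.
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "test-idx",
                "_id": "1",
                "fields": {"terms": ["elastic", "out", "search", "trying"]},
            }
        ]
    }
}

body = json.dumps(build_terms_query("message"))  # string to send as the request body
terms_by_id = extract_terms(sample_response)
```

Using .get with defaults means a hit without the script field yields an empty list instead of raising a KeyError.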

Note that while this script returns the actual terms that were indexed, it also caches all field values, which can use a lot of memory on large indices. On large indices it may therefore be more practical to retrieve the field value from stored fields or from the source and re-analyze it on the fly, using the following MVEL script:

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import java.io.StringReader;

// Cache analyzer for further use
cachedAnalyzer=(isdef cachedAnalyzer)?cachedAnalyzer:doc.mapperService().documentMapper(doc._type.value).mappers().indexAnalyzer();

terms=[];
// Get value from Fields Lookup
//val=_fields[field].values;

// Get value from Source Lookup
val=_source[field];

if(val != null) {
  tokenStream=cachedAnalyzer.tokenStream(field, new StringReader(val)); 
  CharTermAttribute termAttribute = tokenStream.addAttribute(CharTermAttribute); 
  while(tokenStream.incrementToken()) { 
    terms.add(termAttribute.toString())
  }; 
  tokenStream.close(); 
} 
terms

This MVEL script can be stored as config/scripts/analyze.mvel and used with the following query:

curl 'http://localhost:9200/test-idx/_search?pretty=true' -d '{
    "query" : {
        "match_all" : { }
    },
    "script_fields": {
        "terms" : {
            "script": "analyze",
            "params": {
                "field": "message"
            }
        }
    
    }
}'
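
To get a feel for what the re-analysis step in the MVEL script produces, the sketch below approximates it in Python on the sample document from the question. This is an assumption-heavy stand-in: it lowercases and splits on non-word characters, which only roughly resembles a default analyzer, and it does not use the index's configured analyzer the way the MVEL script does. The function name approximate_tokens is hypothetical.

```python
import re


def approximate_tokens(source, field):
    """Rough stand-in for an analyzer: lowercase, then split on non-word characters.

    Assumption: this is NOT the field's real index analyzer; stemming,
    stop words, and custom filters are not reproduced here.
    """
    value = source.get(field)
    if value is None:  # mirrors the `val != null` check in the MVEL script
        return []
    return [token for token in re.split(r"\W+", value.lower()) if token]


# The _source of the document shown in the question.
source = {
    "user": "kimchy",
    "postDate": "2009-11-15T14:12:12",
    "message": "trying out Elastic Search",
}

tokens = approximate_tokens(source, "message")
# tokens == ["trying", "out", "elastic", "search"]
```

Because the real result depends on the analyzer mapped to the field, treat this only as an illustration of the output format, not as a replacement for the script.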
