Run Elasticsearch processor on all the fields of a document

Question

I am trying to trim and lowercase all the values of a document that is being indexed into Elasticsearch.

The available processors all require the field key, which means a processor can be applied to only one field at a time.

Is there a way to run a processor on all the fields of a document?

Recommended answer

There sure is. Use a script processor, but beware of reserved keys like _type, _id, etc. Note that the script only touches top-level string values; numbers, booleans, and nested objects are copied unchanged:

PUT _ingest/pipeline/my_string_trimmer
{
  "description": "Trims and lowercases all string values",
  "processors": [
    {
      "script": {
        "source": """
          def forbidden_keys = [
            '_type',
            '_id',
            '_version_type',
            '_index',
            '_version'
          ];
          
          def corrected_source = [:];
          
          for (pair in ctx.entrySet()) {
            def key = pair.getKey();
            if (forbidden_keys.contains(key)) {
              continue;
            }
            def value = pair.getValue();
            
            if (value instanceof String) {
              corrected_source[key] = value.trim().toLowerCase();
            } else {
              corrected_source[key] = value;
            }
          }
          
          // overwrite the original
          ctx.putAll(corrected_source);
        """
      }
    }
  ]
}

Test with a sample doc:

POST my-index/_doc?pipeline=my_string_trimmer
{
  "abc": " DEF ",
  "def": 123,
  "xyz": false
}
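
As a rough check of what the pipeline should store for that sample doc, here is a minimal plain-Python sketch that mirrors the script's logic. It does not talk to Elasticsearch. The function name trim_and_lowercase is hypothetical, and Python's strip() is only an approximation of Java's trim(), since the two treat some whitespace characters differently.

```python
# Plain-Python model of the Painless script above (illustration only).
# The name trim_and_lowercase is hypothetical, not part of any Elasticsearch API.
FORBIDDEN_KEYS = {"_type", "_id", "_version_type", "_index", "_version"}


def trim_and_lowercase(ctx: dict) -> dict:
    """Return a copy of ctx with top-level string values trimmed and lowercased."""
    corrected = dict(ctx)
    for key, value in ctx.items():
        if key in FORBIDDEN_KEYS:
            continue  # leave reserved metadata keys untouched
        if isinstance(value, str):
            corrected[key] = value.strip().lower()
    return corrected


sample = {"abc": " DEF ", "def": 123, "xyz": False}
result = trim_and_lowercase(sample)
print(result)  # {'abc': 'def', 'def': 123, 'xyz': False}
```

Only the string value changes; the number and the boolean pass through as they are, which is the behavior the instanceof String check in the script is meant to give.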
