ElasticSearch: Delete field from all documents where it exists (with Painless?)


Question

Situation: I have an index with a strict mapping, and I want to delete an old field from it that is no longer used. So I created a new index whose mapping does not include that field, and I tried to reindex the data into the new index.

Problem: When I reindex, I get an error, because the documents still carry a field that is not in the new mapping. To solve this, I want to remove that field from all documents in the original index before I reindex.

PUT old_index/_doc/1
{
    "field_to_delete" : 5
}
PUT old_index/_doc/2
{
    "field_to_delete" : null
}

POST _reindex
{
  "source": {
    "index": "old_index"
  },
  "dest": {
    "index": "new_index"
  }
}

"reason": "mapping set to strict, dynamic introduction of [field_to_delete] within [new_index] is not allowed"

1. Some places suggest doing this:

POST old_index/_doc/_update_by_query
{
  "script": "ctx._source.remove('field_to_delete')",
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "field_to_delete"
          }
        }
      ]
    }
  }
}

However, the exists query doesn't match documents that have an explicit value of null, so reindexing still fails after this update.

2. Others (for example, an Elastic team member on the official forum) suggest something like this:

POST old_index/_doc/_update_by_query
{
  "script": {
    "source": """
          if (ctx._source.field_to_delete != null) {
            ctx._source.remove("field_to_delete");
          } else {
            ctx.op="noop";
          }
      """
  },
  "query": {
    "match_all": {}
  }
}

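However, this has the same problem: it doesn't remove the field from the second document, which has an explicit value of null. The check ctx._source.field_to_delete != null is false both when the key is missing and when it is present with a null value.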
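To make the difference concrete, here is a small sketch that models the documents as plain Python dicts rather than calling Elasticsearch. It is only an analogy: Python's None stands in for JSON null, the two helper functions are hypothetical names, and the third document (one that never had the field) is an assumption added for illustration.

```python
FIELD = "field_to_delete"

# Documents 1 and 2 come from the question. Document 3 is hypothetical:
# it never had the field at all. None stands in for JSON null.
docs = {
    "1": {"field_to_delete": 5},
    "2": {"field_to_delete": None},
    "3": {"other_field": "x"},
}


def not_null_check(source):
    # Analogue of `ctx._source.field_to_delete != null`:
    # a missing key and an explicit null both read back as null.
    return source.get(FIELD) is not None


def key_present_check(source):
    # Analogue of `ctx._source.containsKey("field_to_delete")`:
    # true whenever the key exists, whatever its value.
    return FIELD in source


for doc_id, source in docs.items():
    print(doc_id, not_null_check(source), key_present_check(source))
# 1 True True
# 2 False True
# 3 False False
```

Document 2 is the case that matters: a null comparison cannot tell it apart from document 3, while a key-presence check can.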

3. Finally, I could do this:

POST old_index/_doc/_update_by_query
{
  "script": {
    "source": "ctx._source.remove('field_to_delete');"
  },
  "query": {
    "match_all": {}
  }
}

But this updates every document, including those that never had the field, and for a large index that could mean additional downtime during deployment.

Recommended answer

Eventually I found the correct way to do it, so I'm sharing it for general knowledge:

POST old_index/_doc/_update_by_query
{
  "script": {
    "source": """
        if (ctx._source.containsKey("field_to_delete")) {
            ctx._source.remove("field_to_delete");
        } else {
          ctx.op="noop";
        }
      """
  },
  "query": {
    "match_all": {}
  }
}
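As a rough illustration of what the script above does per document, the following sketch mimics it on plain Python dicts. It does not talk to Elasticsearch. The apply_script helper and the third sample document are hypothetical, and "index" is used here as the operation that rewrites a document, in contrast to "noop".

```python
import copy

FIELD = "field_to_delete"


def apply_script(source):
    """Mimic the Painless script: return (op, resulting_source)."""
    if FIELD in source:                  # containsKey("field_to_delete")
        updated = copy.deepcopy(source)
        del updated[FIELD]               # ctx._source.remove(...)
        return "index", updated          # document gets rewritten
    return "noop", source                # ctx.op = "noop": left untouched


docs = [
    {"field_to_delete": 5},
    {"field_to_delete": None},           # explicit null is still removed
    {"kept_field": "x"},                 # hypothetical doc without the field
]

results = [apply_script(d) for d in docs]
ops = [op for op, _ in results]
print(ops)  # ['index', 'index', 'noop']
```

Only documents that actually contain the key are rewritten, which is what keeps the update cheaper than the unconditional remove in attempt 3.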
