Elasticsearch - IndicesClient.put_settings not working


Problem description


I am trying to update my original index settings. My initial setting looks like this:

client.create(
    index="movies",
    body={
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "analysis": {
                "filter": {
                    "my_custom_stop_words": {
                        "type": "stop",
                        "stopwords": stop_words
                    }
                },
                "analyzer": {
                    "my_custom_analyzer": {
                        "filter": [
                            "lowercase",
                            "my_custom_stop_words"
                        ],
                        "type": "custom",
                        "tokenizer": "standard"
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                "body": {
                    "type": "text",
                    "analyzer": "my_custom_analyzer",
                    "search_analyzer": "my_custom_analyzer",
                    "search_quote_analyzer": "my_custom_analyzer"
                }
            }
        }
    },
    ignore=400
)

And I am trying to add the synonym filter to my existing analyzer (my_custom_analyzer) using client.put_settings:

client.put_settings(
    index="movies",
    body={
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "analysis": {
                "analyzer": {
                    "my_custom_analyzer": {
                        "filter": [
                            "lowercase",
                            "my_stops",
                            "my_synonyms"
                        ],
                        "type": "custom",
                        "tokenizer": "standard"
                    }
                },
                "filter": {
                    "my_custom_stops": {
                        "type": "stop",
                        "stopwords": stop_words
                    },
                    "my_custom_synonyms": {
                        "ignore_case": "true",
                        "type": "synonym",
                        "synonyms": ["Harry Potter, HP => HP", "Terminator, TM => TM"]
                    }
                }
            }
        },
        "mappings": {
            "properties": {
                "body": {
                    "type": "text",
                    "analyzer": "my_custom_analyzer",
                    "search_analyzer": "my_custom_analyzer",
                    "search_quote_analyzer": "my_custom_analyzer"
                }
            }
        }
    },
    ignore=400
)

However, when I search the movies index for "HP", I want the documents ranked so that the one containing "Harry Potter" 5 times comes first. Right now the document containing "HP" 3 times tops the list, so the synonym filter does not seem to be working. I closed the movies index before calling client.put_settings and re-opened it afterwards. Any help would be greatly appreciated!

Solution

You need to reindex all of your data for the updated settings to apply to every document and field.

Data that has already been indexed is not affected by the updated analyzer; only documents indexed after you update the settings are analyzed with it.

Without reindexing, searches can return incorrect results, because the old data was analyzed with the old custom analyzer rather than the new one.
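To make the point above concrete, here is a toy model in plain Python. It is not Elasticsearch code: `old_analyzer`, `new_analyzer`, `build_index` and `search` are hypothetical stand-ins, and the synonym handling is reduced to collapsing "harry potter" into the single token "hp". It only illustrates why documents indexed with the old analyzer stay invisible to a query that goes through the new one.

```python
import re
from collections import defaultdict


def old_analyzer(text):
    """Toy stand-in for lowercase + standard tokenizer (no synonyms)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def new_analyzer(text):
    """Same as old_analyzer, then collapse 'harry potter' into the token 'hp'."""
    tokens = old_analyzer(text)
    out, i = [], 0
    while i < len(tokens):
        if tokens[i:i + 2] == ["harry", "potter"]:
            out.append("hp")
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def build_index(docs, analyzer):
    """Build a token -> set of document ids map using the given analyzer."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyzer(text):
            index[token].add(doc_id)
    return index


def search(index, query, analyzer):
    """Return ids of documents containing any analyzed query token."""
    hits = set()
    for token in analyzer(query):
        hits |= index.get(token, set())
    return hits


docs = {1: "Harry Potter and the Goblet of Fire", 2: "HP fan review"}

# Tokens written with the old analyzer, query analyzed with the new one.
stale_index = build_index(docs, old_analyzer)
stale_hits = search(stale_index, "HP", new_analyzer)

# Tokens rewritten with the new analyzer, i.e. after a reindex.
fresh_index = build_index(docs, new_analyzer)
fresh_hits = search(fresh_index, "HP", new_analyzer)

print(sorted(stale_hits))  # [2]
print(sorted(fresh_hits))  # [1, 2]
```

In the stale index, document 1 still holds the tokens "harry" and "potter", so a query for "hp" cannot reach it until its text is analyzed again.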

The most efficient way to resolve this issue is to create a new index, and move your data from the old one to the new one with the updated settings.
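Before creating the new index, it may also be worth checking the settings body itself. In the question's put_settings call the analyzer lists my_stops and my_synonyms, while the filters are defined as my_custom_stops and my_custom_synonyms; Elasticsearch would likely reject that with a 400 error, which ignore=400 then hides. The helper below is a hypothetical sketch, not part of any library, and `BUILTIN_FILTERS` is deliberately not an exhaustive list.

```python
# A few built-in token filter names that need no definition (not exhaustive).
BUILTIN_FILTERS = {"lowercase", "uppercase", "stop", "asciifolding", "trim"}


def undefined_filters(settings, builtin=BUILTIN_FILTERS):
    """Map each analyzer name to the filters it uses that are neither defined nor built in."""
    analysis = settings.get("analysis", {})
    defined = set(analysis.get("filter", {}))
    missing = {}
    for name, analyzer in analysis.get("analyzer", {}).items():
        unknown = [
            f for f in analyzer.get("filter", [])
            if f not in defined and f not in builtin
        ]
        if unknown:
            missing[name] = unknown
    return missing


# The analysis section from the question; the stopword list is a placeholder.
question_settings = {
    "analysis": {
        "analyzer": {
            "my_custom_analyzer": {
                "filter": ["lowercase", "my_stops", "my_synonyms"],
                "type": "custom",
                "tokenizer": "standard",
            }
        },
        "filter": {
            "my_custom_stops": {"type": "stop", "stopwords": ["the", "a"]},
            "my_custom_synonyms": {
                "type": "synonym",
                "synonyms": ["Harry Potter, HP => HP", "Terminator, TM => TM"],
            },
        },
    }
}

print(undefined_filters(question_settings))
# {'my_custom_analyzer': ['my_stops', 'my_synonyms']}
```

Dropping ignore=400 while debugging should surface the same problem as an exception from the client.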

Reindex API

Follow these steps:

POST _reindex
{
  "source": {
    "index": "movies"
  },
  "dest": {
    "index": "new_movies"
  }
}

DELETE movies

PUT movies
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "analysis": {
      "analyzer": {
        "my_custom_analyzer": {
          "filter": [
            "lowercase",
            "my_custom_stops",
            "my_custom_synonyms"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      },
      "filter": {
        "my_custom_stops": {
          "type": "stop",
          "stopwords": "stop_words"
        },
        "my_custom_synonyms": {
          "ignore_case": "true",
          "type": "synonym",
          "synonyms": [
            "Harry Potter, HP => HP",
            "Terminator, TM => TM"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "body": {
        "type": "text",
        "analyzer": "my_custom_analyzer",
        "search_analyzer": "my_custom_analyzer",
        "search_quote_analyzer": "my_custom_analyzer"
      }
    }
  }
}

POST _reindex?wait_for_completion=false  
{
  "source": {
    "index": "new_movies"
  },
  "dest": {
    "index": "movies"
  }
}

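Because the second request uses wait_for_completion=false, it returns immediately and runs as a background task. Once that task has finished and you've verified that all your data is in place, you can delete the new_movies index:

DELETE new_movies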
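Since the question uses the Python client, the same steps could be sketched as a function. This assumes `client` is an elasticsearch.Elasticsearch instance that accepts 7.x-style keyword arguments (`body=`, `wait_for_completion=`, `refresh=`); the function name `rebuild_with_new_settings` and its parameters are hypothetical. Unlike the requests above, it waits for each reindex to finish so that document counts can be compared before the temporary index is deleted.

```python
def rebuild_with_new_settings(client, index, temp_index, new_body):
    """Copy index to temp_index, recreate index with new_body, then copy the data back.

    Returns (documents copied out, documents copied back).
    """
    # Step 1: copy everything into the temporary index and wait for completion.
    client.reindex(
        body={"source": {"index": index}, "dest": {"index": temp_index}},
        wait_for_completion=True,
        refresh=True,
    )
    expected = client.count(index=temp_index)["count"]

    # Steps 2 and 3: drop the old index and recreate it with the new settings and mappings.
    client.indices.delete(index=index)
    client.indices.create(index=index, body=new_body)

    # Step 4: copy the data back so it is analyzed with the new analyzer.
    client.reindex(
        body={"source": {"index": temp_index}, "dest": {"index": index}},
        wait_for_completion=True,
        refresh=True,
    )
    actual = client.count(index=index)["count"]

    # Step 5: remove the temporary index only when nothing was lost.
    if actual == expected:
        client.indices.delete(index=temp_index)
    return expected, actual
```

A call might look like `rebuild_with_new_settings(es, "movies", "new_movies", new_body)`, where `new_body` is the settings and mappings dictionary from the PUT movies step. For large indices, blocking on the reindex may hit client timeouts, in which case the task-based approach with wait_for_completion=false is the safer choice.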

Hope this helps.
