How to search Elasticsearch case-insensitively


Question

I am using PHP's client library for Elasticsearch. I'd like to create an index that stores a person's id and name, and lets the user search for names in a very flexible way (case-insensitive, partial names, etc.).

Here is a code snippet of what I have so far, annotated with comments for convenience:

<?php

require_once(__DIR__ . '/../init.php');

$client = new Elasticsearch\Client();
$params = [
    'index' => 'person',
    'body' => [
        'settings' => [
            // Simple settings for now, single shard
            'number_of_shards' => 1,
            'number_of_replicas' => 0,
            'analysis' => [
                'filter' => [
                    'shingle' => [
                        'type' => 'shingle'
                    ]
                ],
                'analyzer' => [
                    'my_ngram_analyzer' => [
                        'tokenizer' => 'my_ngram_tokenizer',
                    ]
                ],
                // Allow searching for partial names with nGram
                'tokenizer' => [
                    'my_ngram_tokenizer' => [
                        'type' => 'nGram',
                        'min_gram' => 1,
                        'max_gram' => 15,
                        'token_chars' => ['letter', 'digit']
                    ]
                ]
            ]
        ],
        'mappings' => [
            '_default_' => [
                'properties' => [
                    'person_id' => [
                        'type' => 'string',
                        'index' => 'not_analyzed',
                    ],
                    // The name of the person
                    'value' => [
                        'type' => 'string',
                        'analyzer' => 'my_ngram_analyzer',
                        'term_vector' => 'yes',
                        'copy_to' => 'combined'
                    ],
                ]
            ],
        ]
    ]
];

// Create index `person` with ngram indexing
$client->indices()->create($params);

// Index a single person using this indexing scheme
$params = array();
$params['body']  = array('person_id' => '1234', 'value' => 'Johnny Appleseed');
$params['index'] = 'person';
$params['type']  = 'type';
$params['id']    = 'id';
$ret = $client->index($params);

// Get that document (to prove it's in there)
$getParams = array();
$getParams['index'] = 'person';
$getParams['type']  = 'type';
$getParams['id']    = 'id';
$retDoc = $client->get($getParams);
print_r($retDoc); // success


// Search for that document
$searchParams['index'] = 'person';
$searchParams['type']  = 'type';
$searchParams['body']['query']['match']['value'] = 'J';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // FAILURE

// blow away index so that we can run the script again immediately
$deleteParams = array();
$deleteParams['index'] = 'person';
$retDelete = $client->indices()->delete($deleteParams);

I have had this search feature working at times, but I've been fussing with the script to get case-insensitive matching working as expected, and in the process the script now fails to find any person when either J or j is used as the query value.

Any ideas what might be going on here?

Solution

To fix the case-insensitive part, I added

'filter' => 'lowercase',

to my ngram analyzer.
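
For reference, here is a minimal sketch of how the analysis settings from the question might look with that filter in place. It assumes the filter is written as a list (`['lowercase']`) rather than the plain string shown above, and it leaves out the unused `shingle` filter for brevity; the variable name `$analysis` is only illustrative.

```php
<?php
// Analysis settings from the question, with the built-in lowercase token
// filter added to the custom ngram analyzer.
$analysis = [
    'analyzer' => [
        'my_ngram_analyzer' => [
            'tokenizer' => 'my_ngram_tokenizer',
            // Lowercase every ngram at index time and at query time.
            'filter'    => ['lowercase'],
        ],
    ],
    'tokenizer' => [
        'my_ngram_tokenizer' => [
            'type'        => 'nGram',
            'min_gram'    => 1,
            'max_gram'    => 15,
            'token_chars' => ['letter', 'digit'],
        ],
    ],
];

// This array would take the place of the 'analysis' entry under 'settings'.
echo json_encode($analysis, JSON_PRETTY_PRINT), PHP_EOL;
```

Because a `match` query analyzes the query text with the field's analyzer by default, both `J` and `j` end up as the token `j`, which is also what gets stored for `Johnny Appleseed`.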

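Also, the reason it was failing to begin with is that, with PHP's client library, a search issued right after indexing in the same script does not find the document. My guess is that something async is going on here: Elasticsearch search is near real-time, so a newly indexed document only becomes searchable after the next index refresh (by default about once per second), while a get by ID sees it immediately. So create the index in one script and search it in another, or refresh the index before searching, and it should work.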
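
If everything has to stay in one script, one option is to force a refresh between indexing and searching. The sketch below is only an illustration: `indexThenSearch` is a hypothetical helper, `$client` is assumed to be an already configured elasticsearch-php client, and the `type` parameter follows the older API used in the question.

```php
<?php
// Hypothetical helper: index one document, refresh the index, then search it.
// $client is assumed to be a configured elasticsearch-php client instance.
function indexThenSearch($client, array $doc, string $query)
{
    $client->index([
        'index' => 'person',
        'type'  => 'type',
        'id'    => 'id',
        'body'  => $doc,
    ]);

    // Make the newly indexed document visible to search right away
    // instead of waiting for the periodic refresh.
    $client->indices()->refresh(['index' => 'person']);

    return $client->search([
        'index' => 'person',
        'type'  => 'type',
        'body'  => ['query' => ['match' => ['value' => $query]]],
    ]);
}
```

Calling `indexThenSearch($client, ['person_id' => '1234', 'value' => 'Johnny Appleseed'], 'j')` would then run all three steps in order. Forcing a refresh after every write is costly on a busy index, so this is better suited to tests and one-off scripts than to production indexing.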
