Specifying and using a NGramTokenizer with the C# NEST client for Elastic Search

Question

Updated to show a working sample

I am trying to do a partial search on a collection of usernames in ElasticSearch.

Searching around has pointed me in the nGram Tokenizer direction, but I am stumped on the proper implementation and fail to get any results.

This is the relevant code stripped from the project I'm working on.

I have tried different combinations and search types to no avail.

setup.cs

var client = new ElasticClient(settings.ConnectionSettings);

// (Try and) Setup the nGram tokenizer.
var indexSettings = new IndexSettings();
var customAnalyzer = new CustomAnalyzer();

customAnalyzer.Tokenizer = "mynGram";
customAnalyzer.Filter = new List<string> { "lowercase" };

indexSettings.Analysis.Analyzers.Add("mynGram", customAnalyzer);

indexSettings.Analysis.Tokenizers.Add("mynGram", new NGramTokenizer
                                                    {
                                                        MaxGram = 10,
                                                        MinGram = 2
                                                    });

client.CreateIndex(settings.ConnectionSettings.DefaultIndex, indexSettings);

client.MapFromAttributes<Profile>();

// Create and add a new profile object.
var profile = new Profile
                  {
                      Id = "1",
                      Username = "Russell"
                  };


client.IndexAsync(profile);

// Do search for object
var s = new SearchDescriptor<Profile>().Query(t => t.Term(c => c.Username, "russ"));

var results = client.Search<Profile>(s);

Profile.cs

public class Profile
{
    public string Id { get; set; }

    [ElasticProperty(IndexAnalyzer = "mynGram")]
    public string Username { get; set; }
}

Any tips would be greatly appreciated.

Recommended answer

From the ES docs on nGram token filters:

    "settings" : {
        "analysis" : {
            "analyzer" : {
                "my_ngram_analyzer" : {
                    "tokenizer" : "my_ngram_tokenizer"
                }
            },
            "tokenizer" : {
                "my_ngram_tokenizer" : {
                    "type" : "nGram",
                    "min_gram" : "2",
                    "max_gram" : "3",
                    "token_chars": [ "letter", "digit" ]
                }
            }
        }
    }
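
To make the settings above concrete, here is a minimal Python sketch of the tokens a tokenizer configured this way might emit. It is an approximation for illustration only, not Lucene's implementation: the function name `ngram_tokenize` is hypothetical, `str.isalnum` stands in for the `letter` and `digit` character classes, and the emission order (by start position, then by length) is an assumption.

```python
def ngram_tokenize(text, min_gram=2, max_gram=3):
    """Approximate an nGram tokenizer with token_chars = [letter, digit]."""
    # Split the input into runs of letters/digits; any other character
    # acts as a boundary, so no gram ever spans it.
    runs, current = [], []
    for ch in text:
        if ch.isalnum():
            current.append(ch)
        elif current:
            runs.append("".join(current))
            current = []
    if current:
        runs.append("".join(current))

    # Emit every substring of length min_gram..max_gram from each run.
    grams = []
    for run in runs:
        for start in range(len(run)):
            for size in range(min_gram, max_gram + 1):
                if start + size <= len(run):
                    grams.append(run[start:start + size])
    return grams


print(ngram_tokenize("r.lewis"))
```

In this sketch, `r.lewis` yields `['le', 'lew', 'ew', 'ewi', 'wi', 'wis', 'is']`: the lone `r` produces nothing because it is shorter than `min_gram`, and no gram crosses the dot.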

Things to note


  1. You need to add mynGram to your analyzer or it won't be used. The way it works is like this: each indexed field has an analyzer applied to it, and an analyzer is one tokenizer followed by zero or more token filters. You have defined a nice nGram tokenizer (mynGram) to use, but you did not use it in customAnalyzer, so it is using the standard tokenizer. (Basically you are just defining but never using mynGram.)

  2. You need to tell elasticsearch to use your customAnalyzer in your mapping: "properties": {"string_field": {"type": "string", "index_analyzer": "customAnalyzer"}}

  3. You should change the maxGram to a bigger number (maybe 10); otherwise 4-letter searches will not behave exactly like autocomplete (or could return nothing, depending on the search-time analyzer).

  4. Use the _analyze API endpoint to test your analyzer. Something like this should work:

curl -XGET 'http://yourserver.com:9200/index_name/_analyze?analyzer=customAnalyzer' -d 'rlewis'
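
The notes above can also be checked offline with a small Python sketch. It assumes the analyzer is an nGram tokenizer followed by a `lowercase` filter (as in the question's setup) and that the search is a `term` query, which looks up the search text as-is without analyzing it. The helper names `analyze` and `term_matches` are hypothetical; this simulates the behaviour rather than calling Elasticsearch.

```python
def analyze(text, min_gram, max_gram):
    """Simulate an analyzer: nGram tokenizer, then a lowercase token filter."""
    grams = set()
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.add(text[start:start + size].lower())
    return grams


def term_matches(indexed_value, search_term, min_gram, max_gram):
    """A term query is not analyzed: it hits only if that exact term was indexed."""
    return search_term in analyze(indexed_value, min_gram, max_gram)


# max_gram = 3: the four-letter term "russ" was never indexed.
print(term_matches("Russell", "russ", 2, 3))    # False
# max_gram = 10: "russ" is one of the indexed grams.
print(term_matches("Russell", "russ", 2, 10))   # True
# Indexed grams are lowercased, so a capitalised term does not match.
print(term_matches("Russell", "Russ", 2, 10))   # False
```

The real `_analyze` call should tell the same story: if the gram you search for is not in the returned token list, a term query for it cannot match.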

Good luck!
