Elasticsearch Nest wildcard query with spaces

Question

Short version:

I would like to write an Elasticsearch query using NEST that returns the full indexed items (ContentIndexable, my custom type). The query should behave like a prefix match of [some string] + *, i.e. String.StartsWith(), where [some string] may or may not contain spaces.

This is different from CompletionSuggester, since I need to retrieve the full objects rather than string suggestions.

What I've tried so far:

When I query for text without spaces, the code below returns the desired output. If my search term contains spaces, however, it does not return the expected results.

Here's how I search the fields:

var searchResults = _client.Search<ContentIndexable>(body => body
    .Index(indexName)
    .Query(query => query
        .QueryString(qs => qs
            .OnFields(f => f.Title, f => f.TextContent)
            .Query(searchTerm + "*"))));

And this is a unit test that demonstrates how to reproduce the problem:

indexService.IndexUserItemsSync(testGuid, IndexType.submission, new ContentIndexable
{
    ContentId = Guid.NewGuid(),
    TextContent = "Some description",
    Title = "title"
});

indexService.IndexUserItemsSync(testGuid, IndexType.submission, new ContentIndexable
{
    ContentId = Guid.NewGuid(),
    TextContent = "Some description",
    Title = "title that is long"
});

indexService.IndexUserItemsSync(testGuid, IndexType.submission, new ContentIndexable
{
    ContentId = Guid.NewGuid(),
    TextContent = "Some description",
    Title = "title that likes"
});

indexService.IndexUserItemsSync(testGuid, IndexType.submission, new ContentIndexable
{
    ContentId = Guid.NewGuid(),
    TextContent = "Some description",
    Title = "titlethat"
});

// This one works: all four titles start with "title".
var searchResult = indexService.SearchUserItems(testGuid, IndexType.submission, 10, "title");
Assert.IsNotNull(searchResult);
Assert.AreEqual(4, searchResult.Count());

// This one does not: searchResult2.Count() evaluates to 0 instead of 2.
var searchResult2 = indexService.SearchUserItems(testGuid, IndexType.submission, 10, "title that");
Assert.IsNotNull(searchResult2);
Assert.AreEqual(2, searchResult2.Count());

As you can see, when I enter "title that", the search comes back empty instead of returning the two rows I expect.
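For reference, the counts expected by the test correspond to a plain in-memory prefix filter. The sketch below is only an illustration of the desired String.StartsWith() semantics and does not touch Elasticsearch. The class and method names are hypothetical, and the case-insensitive comparison is an assumption (the standard analyzer lowercases text).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PrefixSemantics
{
    // In-memory stand-in for the desired search: keep titles that start with the term.
    // Case-insensitive comparison is an assumption, not something the question states.
    public static List<string> StartingWith(IEnumerable<string> titles, string searchTerm)
    {
        return titles
            .Where(t => t.StartsWith(searchTerm, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }

    public static void Main()
    {
        var titles = new[] { "title", "title that is long", "title that likes", "titlethat" };

        Console.WriteLine(StartingWith(titles, "title").Count);      // prints 4
        Console.WriteLine(StartingWith(titles, "title that").Count); // prints 2
    }
}
```

Whatever query is used against the index should reproduce these two counts for the four test documents.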

Update: Some more information: I create an index on my type ContentIndexable:

public class ContentIndexable : IIndexable
{
    public Guid ContentId { get; set; }
    public string Title { get; set; }
    public string TextContent { get; set; }
}

With this code:

_client.CreateIndex(
    indexName,
    descriptor =>
    descriptor.AddMapping<ContentIndexable>(
        m => m.Properties(
            p => p.Completion(s => s
                                       .Name(n => n.Title)
                                       .IndexAnalyzer("standard")
                                       .SearchAnalyzer("standard")
                                       .MaxInputLength(30)
                                       .Payloads()
                                       .PreserveSeparators()
                                       .PreservePositionIncrements())
                     .Completion(s => s.Name(n => n.TextContent)
                                          .IndexAnalyzer("standard")
                                          .SearchAnalyzer("standard")
                                          .MaxInputLength(50)
                                          .Payloads()
                                          .PreserveSeparators()
                                          .PreservePositionIncrements())
                 )));

I even tried escaping the whitespace with string.Replace(" ", @"\ "), both when indexing and when querying, but that didn't help.

Changing the search type to a wildcard query didn't help either:

var searchResults = _client.Search<ContentIndexable>(body => body
    .Index(indexName)
    .Query(query => query
        .Wildcard(qd => qd.OnField(f => f.Title).Value(searchTerm + "*"))));
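One plausible contributing factor, assuming the field is analyzed into separate tokens: a wildcard query is a term-level query, so its pattern is compared against each indexed token individually and is not analyzed itself. The sketch below only simulates that behavior in memory. The whitespace tokenizer is a rough stand-in for a real analyzer, all names are hypothetical, and the mapping in the question uses completion fields, so this may not be the whole story.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TermLevelWildcardSketch
{
    // Rough stand-in for an analyzer: lowercase the text and split it on spaces.
    public static string[] Tokenize(string text)
    {
        return text.ToLowerInvariant()
                   .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // A term-level wildcard is tested against each token on its own,
    // never against the original multi-word string.
    public static bool AnyTokenMatches(string text, string wildcard)
    {
        // Translate the wildcard into an anchored regex: * -> .*, ? -> .
        string pattern = "^" + Regex.Escape(wildcard)
                                    .Replace("\\*", ".*")
                                    .Replace("\\?", ".") + "$";
        return Tokenize(text).Any(token => Regex.IsMatch(token, pattern));
    }

    public static void Main()
    {
        var titles = new[] { "title", "title that is long", "title that likes", "titlethat" };

        Console.WriteLine(titles.Count(t => AnyTokenMatches(t, "title*")));      // prints 4
        Console.WriteLine(titles.Count(t => AnyTokenMatches(t, "title that*"))); // prints 0
    }
}
```

Because no single token contains a space, a pattern with a space in it cannot match any token in this simulation, which mirrors the 4 and 0 counts seen in the unit test.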

Does anyone know what I'm doing wrong?

Please note that my CompletionSuggester version works with spaces but unfortunately only returns strings. I need to get the complete item out in order to fetch the ContentId. My CompletionSuggester implementation:

public IEnumerable<string> GetAutoCompleteSuggestions(Guid userId, IndexType indexType, int size, string searchTerm)
{
    string indexName = getIndexName(indexType, userId);

    var result = _client.Search<ContentIndexable>(body => body
        .Index(indexName)
        .SuggestCompletion("content-suggest" + Guid.NewGuid(), descriptor => descriptor
            .OnField(t => t.Title)
            .Text(searchTerm)
            .Size(size)));

    if (result.Suggest == null)
    {
        return new List<string>();
    }

    return (from suggest in result.Suggest
            from value in suggest.Value
            from options in value.Options
            select options.Text).Take(size);
}

I know I can take the suggestions, get the full values out (which yields the two items I'm expecting) and then do a full term match using my first method. That requires two separate calls into Elasticsearch, however (one for the completion suggester and a second for the term query), and ideally I would like to avoid the extra round trip.

Many thanks in advance,

Solution

This is an example of how you can deal with your problem for the Title field.

Change your mapping to something like this (or use MultiField, but I couldn't find an option to map a field as both string and completion at the same time):

client.CreateIndex(indexName, i => i
    .AddMapping<ContentIndexable>(m => m
        .Properties(
            ps => ps
                .Completion(c => c.Name("title.completion")
                    .IndexAnalyzer("standard")
                    .SearchAnalyzer("standard")
                    .MaxInputLength(30)
                    .Payloads()
                    .PreserveSeparators()
                    .PreservePositionIncrements())
                .String(s => s.Name(x => x.Title).CopyTo("title.completion")))));

Change your SuggestCompletion to

var result = client.Search<ContentIndexable>(body => body
    .Index(indexName)
    .SuggestCompletion("content-suggest" + Guid.NewGuid(),
        descriptor => descriptor
            .OnField(t => t.Title.Suffix("completion"))
            .Text("title")
            .Size(10)));

and QueryString to

var searchResponse = client.Search<ContentIndexable>(body => body
    .Index(indexName)
    .Query(query => query
        .QueryString(
            qs => qs
                .OnFields(f => f.Title.Suffix("completion"))
                .Query("title tha" + "*")
                .MinimumShouldMatchPercentage(100))));

The problem with this solution is that we store the data for the Title field twice. This is why I mentioned earlier that it would be great to use MultiField, but I wasn't able to do this with NEST.

Hope this helps.
