Queries with stopwords and searchMode=all return no results

Problem description

Suppose I have a document with these words in its content:

"dolor de cabeza", indexed with the Spanish analyzer. Searching for "dolor de cabeza" (with quotes) returns the document as expected, but searching for dolor de cabeza (without quotes) returns nothing.

In fact, any stopword in the search query causes it to return no documents when queryType=Full and searchMode=All are used.

The problem with the quoted approach is that it only matches the exact phrase.

Is there a workaround? I think this is a bug.

Recommended answer

Short version:

This happens when you issue a search query with searchMode=All against fields whose analyzers process stopwords differently. Make sure you scope your query to fields analyzed with the same analyzer by using the searchFields search request parameter. Alternatively, you can set the same searchAnalyzer on all your searchable fields so that stopwords are removed from your query in the same way for every field. To learn more about custom analyzers and how to set indexAnalyzer and searchAnalyzer independently, see the Azure Search documentation on custom analyzers.
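As a rough illustration of the searchFields workaround, the sketch below builds a GET query URL that restricts the search to a single field. The service name, index name, and API version are placeholders assumed for the example, not values from the question. The request would also need an api-key header, which is not shown.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: substitute your own service, index and API version.
SERVICE = "my-service"
INDEX = "my-index"
API_VERSION = "2020-06-30"


def build_search_url(search_text, search_fields):
    """Build a search URL scoped to the given fields.

    All fields listed in search_fields should use the same analyzer,
    so that stopwords are handled identically for every field.
    """
    params = {
        "api-version": API_VERSION,
        "search": search_text,
        "queryType": "full",
        "searchMode": "all",
        # searchFields takes a comma-separated list of field names
        "searchFields": ",".join(search_fields),
    }
    return (
        f"https://{SERVICE}.search.windows.net/indexes/{INDEX}/docs?"
        + urlencode(params)
    )


if __name__ == "__main__":
    print(build_search_url("wait for", ["field1"]))
```

Scoping to field1 alone means the query is only analyzed by that field's analyzer, so a stopword dropped at query time no longer leaves a mandatory clause against some other field.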

Long version:

Let's take an index with two fields, where one is analyzed with the English Lucene analyzer and the other with the standard (default) analyzer.

{
  "fields":[
    {
      "name":"docId",
      "type":"Edm.String",
      "key":true,
      "searchable":false
    },
    {
      "name":"field1",
      "type":"Edm.String",
      "analyzer":"en.lucene"
    },
    {
      "name":"field2",
      "type":"Edm.String"
    }
  ]
}

Let's add these two documents:

{
  "value":[
    {
      "docId":"1",
      "field1":"Waiting for a bus",
      "field2":"Exploring cosmos"
    },
    {
      "docId":"2",
      "field1":"Run to the hills",
      "field2":"run for your life"
    }
  ]
}

The following query doesn't return any results: search=wait+for&searchMode=all

This is because the terms in the query are processed independently for each field in the index, by the analyzer defined for that field. For field1, the query becomes search=wait ('for' is removed because it is a stopword). For field2, it stays search=wait+for (the standard analyzer doesn't remove stopwords).

Only the first document matches 'wait' (in field1). However, 'for' survives only as a term for field2, and field2 of the first document doesn't contain 'for', so there are no results. When you set searchMode=all, you tell the search engine that every query term must be matched at least once.

For comparison, another query with a stopword, search=running+for&searchMode=all, returns the second document. The term 'running' matches in field1 (it is stemmed to 'run'), and 'for' matches in field2.
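The behavior described above can be reproduced with a small simulation. This is only a toy model: the stopword list, the stemmer, and the matching logic are simplified stand-ins assumed for illustration, not the actual Lucene analyzers or the Azure Search query engine.

```python
# Tiny illustrative stopword subset (assumption, not the real en.lucene list).
STOPWORDS = {"a", "for", "the", "to"}


def stem(token):
    """Toy stemmer: strips a trailing 'ing' and collapses a doubled consonant."""
    if token.endswith("ing") and len(token) > 5:
        token = token[:-3]
        if len(token) >= 2 and token[-1] == token[-2]:
            token = token[:-1]
    return token


def standard_analyzer(text):
    """Lowercases and splits on whitespace; keeps stopwords."""
    return [t.lower() for t in text.split()]


def english_analyzer(text):
    """Lowercases, drops stopwords, and stems the remaining tokens."""
    return [stem(t) for t in standard_analyzer(text) if t not in STOPWORDS]


DOCS = [
    {"docId": "1", "field1": "Waiting for a bus", "field2": "Exploring cosmos"},
    {"docId": "2", "field1": "Run to the hills", "field2": "run for your life"},
]

# Same layout as the index above: field1 is English-analyzed, field2 is standard.
MIXED = {"field1": english_analyzer, "field2": standard_analyzer}


def search_all(query, docs, analyzers):
    """Model of searchMode=all: every query word must match in some field."""
    hits = []
    for doc in docs:
        indexed = {f: set(an(doc[f])) for f, an in analyzers.items()}
        matched_all = True
        for word in query.split():
            # Each field analyzes the word itself; a field whose analyzer
            # drops the word contributes no clause for it.
            clauses = [(f, t) for f, an in analyzers.items() for t in an(word)]
            if not clauses:
                continue  # dropped by every analyzer: nothing left to require
            if not any(t in indexed[f] for f, t in clauses):
                matched_all = False
                break
        if matched_all:
            hits.append(doc["docId"])
    return hits


if __name__ == "__main__":
    print(search_all("wait for", DOCS, MIXED))     # -> []
    print(search_all("running for", DOCS, MIXED))  # -> ['2']
    # Using the same analyzer on both fields removes the stray 'for' clause.
    same = {"field1": english_analyzer, "field2": english_analyzer}
    print(search_all("wait for", DOCS, same))      # -> ['1']
```

In this model, 'for' remains a required term only because field2's analyzer keeps it. Once both fields share an analyzer that drops it, or the query is scoped to field1, the first document matches.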

To learn more about query processing in Azure Search, read "How full text search works in Azure Search".
