Synonym Maps in Azure Search, synonym phrases

Problem description

I'm trying to use synonym maps in Azure Search and I'm running into a problem. I want several words and phrases to map to a single search query.

In other words, when I search for any of:

product 123, product0123, product 0123

I want the search to return results for the query phrase:

product123

After reading the tutorial, it all seemed pretty straightforward.

I'm using the .NET Azure.Search SDK 5.0, so I've done the following:

var synonymMap = new SynonymMap
{
    Name = "test-map",
    Format = SynonymMapFormat.Solr,
    // Explicit mapping: every term or phrase left of "=>" is rewritten to product123
    Synonyms = "product 123, product0123, product 0123=>product123\n"
};
_searchClient.SynonymMaps.CreateOrUpdate(synonymMap);

I then use the map on one of the search fields:

index.Fields.First(x => x.Name == "Title").SynonymMaps = new[] {"test-map"};

So far so good. Now if I search for product0123, I get results for product123, as I would expect. But if I search for the phrase product 123 or product 0123, I get a bunch of irrelevant results. It's almost as if synonym maps do not work with multi-word items.

So I guess my question is: am I using synonym maps incorrectly, or do these maps only work with single-word synonyms?

Recommended answer

Are the phrases product 123 and product 0123 in double quotes? Phrases must be enclosed in double quotes ("product 123"). Double quotes are the operator for phrase search, and in the case of synonyms they ensure that the terms in the phrase are analyzed and matched against the rules in the synonym map as a phrase. Without them, the query parser splits the unquoted phrase into individual terms and tries synonym matching on each term separately. In that case the query becomes product OR 123.

The Azure Search documentation explains how queries are parsed (stage 1) and analyzed (stage 2). Synonyms are applied in the second stage.

To answer your second question in the comment: unfortunately, double quotes are required to match multi-word synonyms. However, as an application developer, you have full control over what gets passed to the search service. For example, given the query product 123 from the user, you can rewrite the query under the hood to improve precision and recall before it is passed to the search service. Phrase or proximity searches can be used to improve precision, and fuzzy or prefix (wildcard) searches can be used to improve recall. You would rewrite the query product 123 to something like "product 123"~10 product 123, and synonyms will apply to the phrase part of the query.
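
The rewrite described above can be sketched as a small string helper. This is only an illustration under stated assumptions: QueryRewriter, Rewrite and the slop parameter are hypothetical names, not part of the Azure Search SDK, and the helper does not escape special query characters in the user's input. The ~10 proximity operator belongs to the full Lucene query syntax, so the rewritten query would presumably need to be sent with that query type selected.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of the Azure Search SDK): turns a raw
// multi-word user query into "phrase"~slop plus the original terms, so
// that multi-word synonym rules can match the quoted phrase part.
public static class QueryRewriter
{
    public static string Rewrite(string userQuery, int slop = 10)
    {
        // Nothing to rewrite for null, empty or whitespace-only input.
        if (string.IsNullOrWhiteSpace(userQuery))
        {
            return string.Empty;
        }

        string trimmed = userQuery.Trim();

        // Leave single terms and queries that already contain quotes untouched.
        bool isMultiWord = trimmed.Any(char.IsWhiteSpace);
        bool hasQuotes = trimmed.IndexOf('"') >= 0;
        if (!isMultiWord || hasQuotes)
        {
            return trimmed;
        }

        // Phrase part (eligible for multi-word synonyms) followed by the
        // original loose terms, which keep recall from dropping.
        return $"\"{trimmed}\"~{slop} {trimmed}";
    }
}
```

With this sketch, QueryRewriter.Rewrite("product 123") returns "product 123"~10 product 123, while a single term such as product0123 is passed through unchanged and is still matched by the single-word synonym rule.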

Nate
