Multiword synonyms with Solr and Hibernate Search

Problem description

I have a synonyms.txt file with the following content:

car accessories, gadi marmat

I am indexing "car accessories" as a single token so that it expands to both "car accessories" and "gadi marmat".

I want the whole synonym to match, so that a query for "gadi marmat" returns the records containing "car accessories".

I am using the shingle filter factory to expand the query, so that a search for "gadi marmat" is expanded to "gadi", "gadi marmat" and "marmat". Since "gadi marmat" is then queried as a single token, it should match "car accessories" and return a result, but it does not. A search for "car accessories" does return results. So the problem must be in how synonyms with multiple words are indexed.
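To make the expansion described above concrete, here is a minimal plain-Java sketch of what shingling does to the query. It is only an illustration and does not use Lucene's shingle filter; the class name `ShingleSketch` and the method `shingles` are made up for this example, and the maximum shingle size of 2 is an assumption based on the two-word phrase in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene or Solr.
class ShingleSketch {

    // Emits every run of 1..maxSize adjacent words, joined by a single space.
    static List<String> shingles(String text, int maxSize) {
        String[] words = text.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder current = new StringBuilder();
            for (int len = 1; len <= maxSize && start + len <= words.length; len++) {
                if (len > 1) {
                    current.append(' ');
                }
                current.append(words[start + len - 1]);
                out.add(current.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [gadi, gadi marmat, marmat]
        System.out.println(shingles("gadi marmat", 2));
    }
}
```

The two-word shingle "gadi marmat" can only hit the index if the index side also contains a single token with exactly that text.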

Please suggest.

Recommended answer

The synonym file is only used to replace the words you are searching for. So if you write

car accessories => gadi marmat

then when the analyzer matches "car accessories", it will try to match "gadi marmat" instead.

It works like a single token.
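As a rough illustration of how such a rule line can be read, the sketch below parses the two common forms of a Solr-style synonyms line in plain Java: an explicit mapping with `=>`, and a comma-separated equivalence list whose meaning depends on `expand`. This is not Lucene's parser; `SynonymRuleSketch`, `parse` and `splitList` are hypothetical names, and the sketch ignores comments, escaping and tokenization.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene or Solr.
class SynonymRuleSketch {

    // Maps each source phrase of one rule line to the phrases that replace it.
    static Map<String, List<String>> parse(String rule, boolean expand) {
        Map<String, List<String>> map = new LinkedHashMap<>();
        if (rule.contains("=>")) {
            // Explicit mapping: every phrase on the left is replaced by the right side.
            String[] sides = rule.split("=>", 2);
            List<String> targets = splitList(sides[1]);
            for (String source : splitList(sides[0])) {
                map.put(source, targets);
            }
        } else {
            // Equivalence list: expand=true keeps all phrases,
            // expand=false reduces every phrase to the first one.
            List<String> terms = splitList(rule);
            if (terms.isEmpty()) {
                return map;
            }
            List<String> targets = expand ? terms : terms.subList(0, 1);
            for (String term : terms) {
                map.put(term, targets);
            }
        }
        return map;
    }

    // Splits on commas, trims and lowercases each phrase, drops empty entries.
    static List<String> splitList(String s) {
        List<String> out = new ArrayList<>();
        for (String part : s.split(",")) {
            String phrase = part.trim().toLowerCase();
            if (!phrase.isEmpty()) {
                out.add(phrase);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("car accessories => gadi marmat", false));
        System.out.println(parse("car accessories, gadi marmat", false));
        System.out.println(parse("car accessories, gadi marmat", true));
    }
}
```

With the comma-separated line from the question and `expand` set to `false`, both phrases reduce to "car accessories"; with the `=>` rule above, "car accessories" is rewritten to "gadi marmat". Either way, the same rule has to be applied at index time and at query time for both sides to end up with the same phrase.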

You can get good results by combining analyzer elements like this:

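// Hibernate Search analyzer definition; place it on an indexed entity class.
@AnalyzerDef(name = "integram",
    tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
    filters = {
        // Normalize case first so the later filters match case-insensitively.
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        // Remove stop words listed in the given classpath resource.
        @TokenFilterDef(factory = StopFilterFactory.class, params = {
            @Parameter(name = "words", value = "lucene/dictionary/stopwords.txt"),
            @Parameter(name = "ignoreCase", value = "true"),
            @Parameter(name = "enablePositionIncrements", value = "true")
        }),
        // Stem the tokens before the synonym lookup.
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
            @Parameter(name = "language", value = "English")
        }),
        // Apply synonyms.txt; expand=false reduces each synonym group to one form.
        // This filter sees lowercased, stemmed tokens, so the entries in
        // synonyms.txt presumably need to be written in that same form.
        @TokenFilterDef(factory = SynonymFilterFactory.class, params = {
            @Parameter(name = "synonyms", value = "lucene/dictionary/synonyms.txt"),
            @Parameter(name = "expand", value = "false")
        }),
        // Stem again so the terms produced by the synonym filter are stemmed too.
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = {
            @Parameter(name = "language", value = "English")
        })
    })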
