Hibernate search to find partial matches of a phrase
Problem description
In my project we are using Hibernate Search 4.5 with lucene-analyzers and Solr.
I provide a text field to my clients. When they type in a phrase, I would like to find all User entities whose names include the given phrase.
For example, consider a list of entries in the database with the following titles:
[ Alan Smith, John Cane, Juno Taylor, Tom Caner Junior ]
jun should return Juno Taylor and Tom Caner Junior
an should return Alan Smith, John Cane and Tom Caner Junior
@AnalyzerDef(name = "customanalyzer",
    tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
    filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
    })
@Analyzer(definition = "customanalyzer")
public class Student implements Serializable {
    @Column(name = "Fname")
    @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
    private String fname;

    @Column(name = "Lname")
    @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
    private String lname;
}
I have tried a wildcard search, but it does not return the results I want:
Query luceneQuery = mythQB
    .keyword()
    .wildcard()
    .onFields("fname")
    .matching("ju*")
    .createQuery();
How can I achieve this?
Solution
First, you didn't assign the analyzer to your field, so it is currently not used. You should use @Field.analyzer.
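Second, to answer your question, this kind of text is best analyzed with an EdgeNGramFilter, which indexes every prefix of each token (j, ju, jun, juno for Juno), so a query for jun finds both Juno and Junior. You should add this filter to your analyzer definition.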
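To make the effect of edge n-grams concrete, here is a minimal plain-Java sketch. It does not use Lucene or Hibernate Search; the class and method names (EdgeNGramSketch, edgeNGrams, indexTerms, matches) are made up for illustration, stemming is left out, and the gram sizes 1 to 15 are an assumption that mirrors the maxGramSize used in this answer.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: mimics whitespace tokenizing + lowercasing + edge n-grams.
final class EdgeNGramSketch {

    // All prefixes of `token` whose length lies in [minGram, maxGram].
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Terms that would end up in the index for one name (assumed sizes: 1..15).
    static Set<String> indexTerms(String text) {
        Set<String> terms = new LinkedHashSet<>();
        for (String token : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            terms.addAll(edgeNGrams(token, 1, 15));
        }
        return terms;
    }

    // The lowercased query matches when it equals one of the indexed terms.
    static boolean matches(String name, String query) {
        return indexTerms(name).contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String[] names = { "Alan Smith", "John Cane", "Juno Taylor", "Tom Caner Junior" };
        for (String name : names) {
            System.out.println(name + " | jun: " + matches(name, "jun") + " | an: " + matches(name, "an"));
        }
    }
}
```

In this sketch jun matches Juno Taylor and Tom Caner Junior, but an matches nothing, because edge n-grams only cover the beginning of each token. If you really need matches in the middle of a word (an inside Alan or Cane), a full n-gram filter such as Lucene's NGramFilterFactory is probably the closer fit, at the cost of a larger index.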
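EDIT: Also, to prevent a query such as "sathya" from matching "sanchana", you should use a different analyzer when querying. Otherwise the query text is split into edge n-grams as well, and any shared prefix (here "s" and "sa") produces a match.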
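The difference between the two query-time behaviours can be sketched in plain Java. This is only a simulation under the assumption that a keyword query matches as soon as one analyzed query term is found in the index; QueryAnalyzerSketch and its methods are hypothetical names, not part of Hibernate Search.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustration only: compares an n-grammed query with a query kept whole.
final class QueryAnalyzerSketch {

    // Lowercased prefixes of length 1..maxGram of a single token.
    static Set<String> edgeNGrams(String token, int maxGram) {
        Set<String> grams = new HashSet<>();
        String lower = token.toLowerCase(Locale.ROOT);
        for (int len = 1; len <= Math.min(maxGram, lower.length()); len++) {
            grams.add(lower.substring(0, len));
        }
        return grams;
    }

    // Same analyzer on both sides: the query is n-grammed too, so one shared prefix is enough.
    static boolean matchesWithNGramQuery(String indexedToken, String query) {
        Set<String> shared = edgeNGrams(indexedToken, 15);
        shared.retainAll(edgeNGrams(query, 15));
        return !shared.isEmpty();
    }

    // Separate query analyzer: the query stays whole and must equal an indexed prefix.
    static boolean matchesWithPlainQuery(String indexedToken, String query) {
        return edgeNGrams(indexedToken, 15).contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println("n-grammed query: " + matchesWithNGramQuery("sanchana", "sathya"));
        System.out.println("plain query:     " + matchesWithPlainQuery("sanchana", "sathya"));
    }
}
```

With the n-grammed query, sathya hits sanchana through the shared terms s and sa; with the query kept whole it does not, while a real prefix such as sat still finds sathya.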
Below is a full example.
// Assuming @AnalyzerDef is not repeatable in this Hibernate Search version,
// the two definitions are grouped in @AnalyzerDefs.
@AnalyzerDefs({
    // Index-time analyzer: lowercase, stem, then emit the prefixes (edge n-grams) of each token
    @AnalyzerDef(name = "customanalyzer",
        tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") }),
            @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "15") })
        }),
    // Query-time analyzer: same chain without the edge n-gram filter
    @AnalyzerDef(name = "customanalyzer_query",
        tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
        })
})
public class Student implements Serializable {
    @Column(name = "Fname")
    @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
    private String fname;

    @Column(name = "Lname")
    @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
    private String lname;
}
Then explicitly specify that you want to use this "query" analyzer when building your query:
QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Student.class)
        // Here come the assignments of "query" analyzers
        .overridesForField("fname", "customanalyzer_query")
        .overridesForField("lname", "customanalyzer_query")
        .get();
// Then it's business as usual
Query luceneQuery = queryBuilder.keyword().onFields("fname", "lname").matching("sathya").createQuery();
FullTextQuery query = fullTextEntityManager.createFullTextQuery(luceneQuery, Student.class);
See also: https://stackoverflow.com/a/43047342/6692043
By the way, if your data includes only first and last names, you shouldn't use stemming (SnowballPorterFilterFactory): it will only make the search less accurate for no good reason.