Hibernate Search to find partial matches of a phrase

Posted: 2018/6/11 14:54:46 · Tags: java, hibernate, lucene, hibernate-search, solr


Problem description

In my project we are using Hibernate Search 4.5 with lucene-analyzers and Solr.
I provide a text field to my clients. When they type in a phrase, I would like to find all User entities whose names include the given phrase.

For example, consider a database containing entries with the following titles:

[ Alan Smith, John Cane, Juno Taylor, Tom Caner Junior ]

"jun" should return Juno Taylor and Tom Caner Junior.

"an" should return Alan Smith, John Cane and Tom Caner Junior.

    @AnalyzerDef(name = "customanalyzer", tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class), filters = {
        @TokenFilterDef(factory = LowerCaseFilterFactory.class),
        @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
    })
    @Analyzer(definition = "customanalyzer")
    public class Student implements Serializable {

        @Column(name = "Fname")
        @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
        private String fname;

        @Column(name = "Lname")
        @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES)
        private String lname;

    }

I have tried a wildcard search, but the documentation states:

"Wildcard queries do not apply the analyzer on the matching terms. Otherwise the risk of * or ? being mangled is too high."

    Query luceneQuery = mythQB
        .keyword()
        .wildcard()
        .onFields("fname")
        .matching("ju*")
        .createQuery();

How can I achieve this?

Solution

First, you didn't assign the analyzer to your field, so it isn't used currently. You should use @Field.analyzer.

Second, to answer your question, this kind of text is best analyzed with an EdgeNGramFilter. You should add this filter to your analyzer definition.
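
To make the idea concrete, here is a minimal plain-Java sketch of what edge n-grams are. It is an illustration only, not Lucene's EdgeNGramFilter: the class and method names are made up for this example, stemming is left out, and a minimum gram size of 1 is assumed. Each indexed word is expanded into its prefixes, so a short lowercase query term can be found by exact term lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Plain-Java illustration of edge n-grams; hypothetical names, not the Lucene filter itself. */
final class EdgeNGramSketch {

    /** Returns the prefixes of token whose length lies between minGram and maxGram. */
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    /** Emulates index-time analysis: whitespace split, lowercase, then edge n-grams (1 to 15). */
    static List<String> indexTerms(String name) {
        List<String> terms = new ArrayList<>();
        for (String word : name.trim().split("\\s+")) {
            terms.addAll(edgeNGrams(word.toLowerCase(Locale.ROOT), 1, 15));
        }
        return terms;
    }

    public static void main(String[] args) {
        String[] names = {"Alan Smith", "John Cane", "Juno Taylor", "Tom Caner Junior"};
        for (String query : new String[] {"jun", "an"}) {
            for (String name : names) {
                // A name matches when the query term equals one of its indexed prefixes.
                if (indexTerms(name).contains(query)) {
                    System.out.println(query + " -> " + name);
                }
            }
        }
    }
}
```

In this sketch "jun" finds Juno Taylor and Tom Caner Junior, while "an" finds nothing, because edge n-grams only cover the beginning of each word. If matches in the middle of a word (such as "an" in "Alan") are required, full n-grams would presumably be needed instead, for example via Lucene's NGramFilterFactory, at the cost of a larger index.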

EDIT: Also, to prevent a query such as "sathya" from matching "sanchana", for instance, you should use a different analyzer when querying, so that the query text is not itself split into n-grams.
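
The following plain-Java sketch shows why the query-time analyzer matters. It does not use the Hibernate Search API; the names are hypothetical, stemming is omitted, and it assumes that a keyword query matches as soon as any analyzed query term is present in the index. If the query is analyzed with edge n-grams too, "sathya" produces the terms "s" and "sa", which are also prefixes of "sanchana".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Illustration of index-time versus query-time analysis; hypothetical names. */
final class QueryAnalyzerSketch {

    /** Index-time analysis: whitespace split, lowercase, then every prefix up to 15 characters. */
    static List<String> analyzeForIndex(String text) {
        List<String> terms = new ArrayList<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            for (int len = 1; len <= Math.min(15, word.length()); len++) {
                terms.add(word.substring(0, len));
            }
        }
        return terms;
    }

    /** Query-time analysis: whitespace split and lowercase only, no n-grams. */
    static List<String> analyzeForQuery(String text) {
        List<String> terms = new ArrayList<>();
        for (String word : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                terms.add(word);
            }
        }
        return terms;
    }

    /** Assumed matching rule: at least one query term is among the indexed terms. */
    static boolean matches(List<String> queryTerms, List<String> indexedTerms) {
        return !Collections.disjoint(queryTerms, indexedTerms);
    }

    public static void main(String[] args) {
        List<String> indexed = analyzeForIndex("sanchana");
        // Same analyzer on both sides: the shared grams "s" and "sa" cause a false match.
        System.out.println(matches(analyzeForIndex("sathya"), indexed)); // true
        // Dedicated query analyzer: only the whole term "sathya" is looked up.
        System.out.println(matches(analyzeForQuery("sathya"), indexed)); // false
        // A genuine prefix still matches.
        System.out.println(matches(analyzeForQuery("san"), indexed));    // true
    }
}
```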

Below is a full example.

    // Hibernate Search 4.5 predates repeatable annotations, so both definitions are wrapped in @AnalyzerDefs.
    @AnalyzerDefs({
        @AnalyzerDef(name = "customanalyzer",
            tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") }),
                // Index-time only: emit the prefixes of each token, up to 15 characters
                @TokenFilterDef(factory = EdgeNGramFilterFactory.class, params = { @Parameter(name = "maxGramSize", value = "15") })
            }),
        // Same chain without the n-gram filter, to be used when querying
        @AnalyzerDef(name = "customanalyzer_query",
            tokenizer = @TokenizerDef(factory = WhitespaceTokenizerFactory.class),
            filters = {
                @TokenFilterDef(factory = LowerCaseFilterFactory.class),
                @TokenFilterDef(factory = SnowballPorterFilterFactory.class, params = { @Parameter(name = "language", value = "English") })
            })
    })
    public class Student implements Serializable {

        @Column(name = "Fname")
        @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
        private String fname;

        @Column(name = "Lname")
        @Field(index = Index.YES, store = Store.YES, analyze = Analyze.YES, analyzer = @Analyzer(definition = "customanalyzer"))
        private String lname;

    }

Then explicitly request the "query" analyzer when building your query:

    QueryBuilder queryBuilder = fullTextEntityManager.getSearchFactory().buildQueryBuilder().forEntity(Student.class)
        // Here come the assignments of "query" analyzers
        .overridesForField("fname", "customanalyzer_query")
        .overridesForField("lname", "customanalyzer_query")
        .get();
    // Then it's business as usual
    Query luceneQuery = queryBuilder.keyword().onFields("fname", "lname").matching("sathya").createQuery();
    FullTextQuery query = fullTextEntityManager.createFullTextQuery(luceneQuery, Student.class);

See also: https://stackoverflow.com/a/43047342/6692043



By the way, if your data includes only first and last names, you shouldn't use stemming (SnowballPorterFilterFactory): it will only make the search less accurate for no good reason.
