Lucene wildcard matching fails on chemical notations(?)

Views: 104 | Published: 2020/5/4 7:50:07 | Tags: java lucene wildcard matching hibernate-search

Problem description

Using Hibernate Search annotations (mostly just @Field(index = Index.TOKENIZED)), I've indexed a number of fields of a persisted class of mine called Compound. I've set up text search over all the indexed fields using the MultiFieldQueryParser, which has worked fine so far.

Among the indexed and searchable fields is one called compoundName, with sample values such as:

3-Hydroxyflavone
6,4'-Dihydroxyflavone

When I search for either of these values in full, the related Compound instances are returned. However, problems occur when I use a partial name and introduce wildcards:

searching for 3-Hydroxyflav* still gives the correct hit, but
searching for 6,4'-Dihydroxyflav* fails to find anything.

As I'm quite new to Lucene and Hibernate Search, I'm not sure where to look at this point. I think it might have something to do with the ' in the second query, but I don't know how to proceed. Should I look into Tokenizers, Analyzers, QueryParsers, or something else entirely?

Or can anyone tell me how to get the second wildcard search to match, preferably without breaking the multi-field search behavior?
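One way to reason about the difference between the two searches is sketched below. Lucene's classic query parser does not run wildcard terms through the analyzer (by default it only lowercases them), so the text before the * has to be a prefix of a single indexed term. The sketch is plain Java, not Lucene code: the class WildcardSketch and the method prefixMatches are hypothetical names, and the two token lists are assumptions about what the analyzer produced at index time, not output captured from a real index.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Toy model of how a trailing-wildcard term is compared with index terms. Not Lucene code. */
class WildcardSketch {

    /** True if any single index term starts with the lowercased text before the trailing '*'. */
    static boolean prefixMatches(List<String> indexTerms, String wildcardQuery) {
        if (!wildcardQuery.endsWith("*")) {
            throw new IllegalArgumentException("sketch only handles a trailing '*'");
        }
        // The query term is lowercased but otherwise left untouched (no tokenizing).
        String prefix = wildcardQuery
                .substring(0, wildcardQuery.length() - 1)
                .toLowerCase(Locale.ROOT);
        for (String term : indexTerms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // ASSUMPTION: the analyzer keeps "3-Hydroxyflavone" as one lowercased term.
        List<String> first = Arrays.asList("3-hydroxyflavone");
        // ASSUMPTION: the analyzer splits "6,4'-Dihydroxyflavone" around the apostrophe.
        List<String> second = Arrays.asList("6,4", "dihydroxyflavone");

        System.out.println(prefixMatches(first, "3-Hydroxyflav*"));        // true
        System.out.println(prefixMatches(second, "6,4'-Dihydroxyflav*"));  // false
        System.out.println(prefixMatches(second, "Dihydroxyflav*"));       // true
    }
}
```

If the assumed token lists are right, the second query fails because no single indexed term begins with 6,4'-dihydroxyflav. Printing the tokens the analyzer actually emits for the compound names would confirm or refute that assumption.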

I'm using Hibernate-Search 3.1.0.GA and Lucene-core 2.9.3.

Some relevant code to illustrate my current approach:

Relevant parts of the indexed Compound class:

@Entity
@Indexed
@Data
@EqualsAndHashCode(callSuper = false, of = { "inchikey" })
public class Compound extends DomainObject {
    @NaturalId
    @NotEmpty
    @Length(max = 30)
    @Field(index = Index.TOKENIZED)
    private String                  inchikey;

    @ManyToOne
    @IndexedEmbedded
    private ChemicalClass           chemicalClass;

    @Field(index = Index.TOKENIZED)
    private String                  commonName;
...
}

How I currently search over the indexed fields:

String[] searchfields = Compound.getSearchfields();
MultiFieldQueryParser parser = 
    new MultiFieldQueryParser(Version.LUCENE_29, searchfields, new StandardAnalyzer(Version.LUCENE_29));
FullTextSession fullTextSession = Search.getFullTextSession(getSession());
FullTextQuery fullTextQuery = 
    fullTextSession.createFullTextQuery(parser.parse("searchterms"), Compound.class);
List<Compound> hits = fullTextQuery.list();
