Lucene is not matching strings that contain uppercase characters

Problem description

I am using the Lucene search engine (v3.6) with the StandardAnalyzer and the MultiFieldQueryParser.

One of my fields is set as NOT_ANALYZED because it holds a version name containing alphanumeric characters and dots. When this field contains an uppercase character, the search finds no results. Any ideas?

Details:

The field contains values such as:


  • version1.26.12.test.a
  • version1.26.12.test.b
  • v1.2
  • version1.Dummy

My search returns results for the first three examples above, but not for the last one.

I have not customized Lucene at all except that I bypassed the standard stopwords with Collections.emptySet().

Thanks a lot. Dimitri

Recommended answer

I believe that if you mark a field as NOT_ANALYZED, it is indexed as is. However, StandardAnalyzer applies a LowerCaseFilter (among other filters; see https://lucene.apache.org/core/old_versioned_docs/versions/3_0_1/api/all/org/apache/lucene/analysis/standard/StandardAnalyzer.html). So if you search for "version1.Dummy", your query string will probably become "version1.dummy", which won't match the stored string.
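The mismatch can be illustrated with a small plain-Java simulation. This is only a sketch and does not use the Lucene API: `CaseMismatchSketch`, `indexVerbatim`, `analyzeQuery`, and `matches` are hypothetical names introduced here, and the query "analysis" is reduced to lowercasing alone, whereas a real analyzer also tokenizes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the index/query pipeline; not Lucene code.
class CaseMismatchSketch {

    // Models a NOT_ANALYZED field: every value is kept verbatim as one term.
    static Set<String> indexVerbatim(List<String> values) {
        return new HashSet<>(values);
    }

    // Models only the lowercasing step applied to the query text.
    // Tokenization is deliberately left out of this simplification.
    static String analyzeQuery(String queryText) {
        return queryText.toLowerCase(Locale.ROOT);
    }

    // Term lookup is an exact, case-sensitive comparison.
    static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        Set<String> terms = indexVerbatim(Arrays.asList(
                "version1.26.12.test.a",
                "version1.26.12.test.b",
                "v1.2",
                "version1.Dummy"));

        // All-lowercase values survive query lowercasing unchanged.
        System.out.println(matches(terms, analyzeQuery("v1.2")));           // true

        // The query becomes "version1.dummy", but the index holds "version1.Dummy".
        System.out.println(matches(terms, analyzeQuery("version1.Dummy"))); // false

        // Leaving the query term unmodified restores the match.
        System.out.println(matches(terms, "version1.Dummy"));               // true
    }
}
```

In Lucene itself, two common ways to keep both sides consistent are to lowercase the value before indexing it, or to use a KeywordAnalyzer for this one field at query time (for example through a PerFieldAnalyzerWrapper) so that the query term is left untouched. Which one fits depends on whether case-insensitive matching on version names is wanted.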
