Match the exact value of an entire field in Lucene


Question

I'm creating a Lucene 4.10.3 index.

I am using the StandardAnalyzer.

    String indexpath = "C:\\TEMP";
    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_10_3, new StandardAnalyzer(CharArraySet.EMPTY_SET));
    // The open mode is read when the writer is constructed, so set it first.
    iwc.setOpenMode(IndexWriterConfig.OpenMode.CREATE_OR_APPEND);
    Directory dir = FSDirectory.open(new File(indexpath));
    IndexWriter indexWriter = new IndexWriter(dir, iwc);
    // Note: every value is added to the same Document as a multi-valued "city" field.
    Document doc = new Document();
    doc.add(new TextField("city", "ANDHRA", Store.YES));
    doc.add(new TextField("city", "ANDHRA PRADESH", Store.YES));
    doc.add(new TextField("city", "ASSAM AND NAGALAND", Store.YES));
    doc.add(new TextField("city", "ASSAM", Store.YES));
    doc.add(new TextField("city", "PUNJAB", Store.YES));
    doc.add(new TextField("city", "PUNJAB AND HARYANA", Store.YES));
    indexWriter.addDocument(doc);
    // Commit the changes and release the write lock.
    indexWriter.close();

When I search the Lucene index using a phrase query, for example:

    try {
        QueryBuilder build = new QueryBuilder(new KeywordAnalyzer());
        Query q1 = build.createPhraseQuery("city", "ANDHRA");
        Directory dir = FSDirectory.open(new File("C:\\TEMP"));
        DirectoryReader indexReader = DirectoryReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(indexReader);
        ScoreDoc hits[] = searcher.search(q1, 10).scoreDocs;
        Set<String> set = new HashSet<String>();
        set.add("city");
        for (int i = 0; i < hits.length; i++) {
            Document document = indexReader.document(hits[i].doc, set);
            System.out.println(document.get("city"));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }

I get the following results:

ANDHRA

ANDHRA PRADESH

When I search for "ANDHRA", how can I get only "ANDHRA" and not "ANDHRA PRADESH"? How can I match the entire field value in Lucene while using the StandardAnalyzer?

Recommended answer

If you want to match the exact, unmodified, and untokenized value of the field, you shouldn't analyze it at all. Simply use a StringField instead of a TextField.
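A minimal sketch of that suggestion, assuming Lucene 4.10.x on the classpath (lucene-core and lucene-analyzers-common). The class name `ExactMatchSketch`, the in-memory `RAMDirectory`, and the choice to store one document per value are assumptions made for illustration; the question instead adds every value to a single document. Because a StringField is indexed as a single unanalyzed term, a TermQuery on the full value should match only documents whose field is exactly that value.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical class name, used only for this illustration.
class ExactMatchSketch {

    // Indexes a few sample values and returns the stored "city" values
    // of the documents whose whole field equals the given value.
    static List<String> search(String value) throws IOException {
        Directory dir = new RAMDirectory(); // in-memory index (assumption, for a self-contained demo)
        IndexWriterConfig iwc = new IndexWriterConfig(
                Version.LUCENE_4_10_3, new StandardAnalyzer(CharArraySet.EMPTY_SET));

        try (IndexWriter writer = new IndexWriter(dir, iwc)) {
            String[] cities = {"ANDHRA", "ANDHRA PRADESH", "ASSAM AND NAGALAND",
                               "ASSAM", "PUNJAB", "PUNJAB AND HARYANA"};
            for (String city : cities) {
                Document doc = new Document(); // one document per value (assumption)
                // StringField bypasses the analyzer: the whole value becomes one term.
                doc.add(new StringField("city", city, Store.YES));
                writer.addDocument(doc);
            }
        } // closing the writer commits the documents

        List<String> result = new ArrayList<>();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // TermQuery is not analyzed either, so it compares against the raw value.
            TermQuery query = new TermQuery(new Term("city", value));
            for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
                result.add(searcher.doc(hit.doc).get("city"));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(search("ANDHRA"));
        System.out.println(search("ANDHRA PRADESH"));
    }
}
```

Note that the match is case-sensitive and byte-for-byte: a query for "andhra" would not match in this sketch, since neither side is lowercased.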

If you want some analysis (e.g., lowercasing), but without tokenizing, you can use KeywordTokenizer in your Analyzer implementation for that.
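One way this could look, again assuming Lucene 4.10.x: a custom Analyzer that chains KeywordTokenizer (the whole input becomes one token) with LowerCaseFilter. The class name `LowercaseKeywordSketch`, the constant name, and the sample data are hypothetical. The same analyzer is used at index time and at query time, so the lowercasing is applied on both sides.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.core.KeywordTokenizer;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.QueryBuilder;
import org.apache.lucene.util.Version;

// Hypothetical class name, used only for this illustration.
class LowercaseKeywordSketch {

    // Emits the entire field value as a single lowercased token.
    static final Analyzer WHOLE_VALUE_LOWERCASE = new Analyzer() {
        @Override
        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
            Tokenizer source = new KeywordTokenizer(reader);
            TokenStream lowered = new LowerCaseFilter(source);
            return new TokenStreamComponents(source, lowered);
        }
    };

    static List<String> search(String userInput) throws IOException {
        Directory dir = new RAMDirectory(); // in-memory index (assumption)
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_4_10_3, WHOLE_VALUE_LOWERCASE);
        try (IndexWriter writer = new IndexWriter(dir, iwc)) {
            for (String city : new String[] {"ANDHRA", "ANDHRA PRADESH", "PUNJAB"}) {
                Document doc = new Document();
                // TextField is analyzed, but this analyzer keeps the value in one token.
                doc.add(new TextField("city", city, Store.YES));
                writer.addDocument(doc);
            }
        }

        // Build the query with the same analyzer so the input is lowercased the same way.
        // A single token yields a plain term query; empty input would yield null.
        Query query = new QueryBuilder(WHOLE_VALUE_LOWERCASE).createPhraseQuery("city", userInput);

        List<String> result = new ArrayList<>();
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
                result.add(searcher.doc(hit.doc).get("city")); // stored value keeps its original case
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(search("andhra"));
        System.out.println(search("Andhra Pradesh"));
    }
}
```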

If you are using a QueryParser to create your queries, be aware of how the parser uses spaces to separate query clauses. You may find it necessary to write queries like: city:ANDHRA\ PRADESH (I do not believe QueryParser.escape will do this for you).
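A small sketch of the difference, assuming Lucene 4.10.x with the lucene-queryparser module on the classpath and the classic QueryParser paired with a KeywordAnalyzer. The class name `EscapedSpaceSketch` and the helper `escapeWithSpaces` are hypothetical; the helper simply adds a backslash before each space after calling QueryParser.escape, on the assumption (in line with the answer's remark) that escape leaves whitespace untouched.

```java
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

// Hypothetical class name, used only for this illustration.
class EscapedSpaceSketch {

    // Parses query text against the default field "city" without tokenizing terms.
    static Query parse(String queryText) throws ParseException {
        QueryParser parser = new QueryParser(Version.LUCENE_4_10_3, "city", new KeywordAnalyzer());
        return parser.parse(queryText);
    }

    // Hypothetical helper: escape query syntax characters, then escape spaces too,
    // so the parser treats the whole value as one term.
    static String escapeWithSpaces(String value) {
        return QueryParser.escape(value).replace(" ", "\\ ");
    }

    public static void main(String[] args) throws ParseException {
        // Unescaped: the space separates two clauses (ANDHRA and PRADESH).
        Query split = parse("city:ANDHRA PRADESH");
        System.out.println(split.getClass().getSimpleName() + ": " + split);

        // Escaped: the space is part of a single term.
        Query whole = parse("city:" + escapeWithSpaces("ANDHRA PRADESH"));
        System.out.println(whole.getClass().getSimpleName() + ": " + whole);
    }
}
```

If the query value comes from code rather than from user-typed query syntax, constructing a TermQuery directly avoids the escaping question altogether.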
