Multiple Field Query handling in Lucene


Question

I have written an index searcher in Lucene that searches multiple fields of the index.

It takes the query as two strings: one is the title and the other is the city name.

The index has three fields: title, address and city.

A hit should occur only if both the title and the city name match. To that end, I wrote the following searcher code using MultiFieldQueryParser, with the help of a post:

public void searchdb(String myQuery, String myCity) throws Exception
{
    System.out.println("Searching in the database ...");
    String[] fields={"title","address","city"};
    MultiFieldQueryParser parser = new MultiFieldQueryParser(Version.LUCENE_CURRENT, fields, new StandardAnalyzer(Version.LUCENE_CURRENT));
    parser.setDefaultOperator(QueryParser.Operator.AND);
    if(!myQuery.toLowerCase().contains(myCity.toLowerCase()))
    {
        myQuery="title:"+myQuery+" "+"address:"+myQuery+" "+myCity+" "+"city:"+myCity;
    }
    Query query=parser.parse(myQuery);
    if (query instanceof BooleanQuery) 
    {
        BooleanClause.Occur[] flags ={BooleanClause.Occur.MUST,BooleanClause.Occur.SHOULD,BooleanClause.Occur.MUST};
        BooleanQuery booleanQuery = (BooleanQuery) query;
        BooleanClause[] clauses = booleanQuery.getClauses();
        System.out.println("Query="+booleanQuery.toString()+" and Number of clauses="+clauses.length);
        for (int i = 0; i < clauses.length; i++) 
        {
            clauses[i].setOccur(flags[i]);
        }
        Directory dir=FSDirectory.open(new File("demoIndex"));
        IndexSearcher searcher = new IndexSearcher(dir, true);
        TopDocs hits = searcher.search(booleanQuery, 20);
        searcher.close();
        dir.close();
        System.out.println("Number of hits="+hits.totalHits);
    }
}

But it does not work correctly.

For example, if the query is "Pizza Hut" and the city is "Mumbai", I want "Pizza Hut" to be searched only in the title field and "Mumbai" only in the city field.

However, "Hut" is also being searched in the city field: booleanQuery.toString() prints "+title:pizza +(title:hut city:hut) +city:mumbai".

As a result, the for loop throws an index-out-of-bounds error.

I am new to Lucene, so I am asking for help to fix the problem.

Recommended answer

MultiFieldQueryParser is meant for searching the same keyword(s) across multiple fields.
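To see why the hand-built query string in the question misbehaves, here is a simplified plain-Java model of how a `field:` prefix is scoped in the query syntax: it applies only to the term it is attached to, and any bare term falls back to all of the parser's default fields. This is a sketch for illustration, not Lucene's actual parser; the class name `FieldScopeSketch` and the method `expand` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldScopeSketch {
    // Simplified model, NOT Lucene's parser: a "field:" prefix binds only to the
    // term it is attached to; a bare term is expanded over all default fields.
    static List<String> expand(String query, String[] defaultFields) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input produces no clauses
            }
            if (token.indexOf(':') > 0) {
                // Explicitly fielded term: kept as a single clause.
                clauses.add(token.toLowerCase());
            } else {
                // Bare term: one grouped clause covering every default field.
                StringBuilder sb = new StringBuilder("(");
                for (int i = 0; i < defaultFields.length; i++) {
                    if (i > 0) {
                        sb.append(' ');
                    }
                    sb.append(defaultFields[i]).append(':').append(token.toLowerCase());
                }
                clauses.add(sb.append(')').toString());
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        String[] fields = {"title", "address", "city"};
        // The string the question's code builds for "Pizza Hut" / "Mumbai".
        String q = "title:Pizza Hut address:Pizza Hut Mumbai city:Mumbai";
        List<String> clauses = expand(q, fields);
        for (String c : clauses) {
            System.out.println(c);
        }
        System.out.println("clauses=" + clauses.size());
    }
}
```

In this model the question's query string yields more clauses than the three entries in the `flags` array, which is consistent with the loop over `clauses` running past the end of `flags`.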

Your use case is simpler, because you already have the city keyword and the title keyword as separate values. Try the following code.

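// Imports assume Lucene 3.x package names, matching the API used in the question.
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.Version;

StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
// city query: parsed against the "city" field only
QueryParser cityQP = new QueryParser(Version.LUCENE_CURRENT, "city", analyzer);
Query cityQuery = cityQP.parse(myCity);

// title query: parsed against the "title" field only
QueryParser titleQP = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer);
Query titleQuery = titleQP.parse(myQuery);

// final query: both sub-queries are required
BooleanQuery finalQuery = new BooleanQuery();
finalQuery.add(cityQuery, Occur.MUST); // MUST means this clause has to match.
finalQuery.add(titleQuery, Occur.MUST); // Making every clause MUST is equivalent to the "AND" operator.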
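
One detail worth noting: with the parser's default OR operator, a multi-word title such as Pizza Hut is parsed into separate terms, so a document matching only one of the words would still satisfy the title clause. One option is calling setDefaultOperator(QueryParser.Operator.AND) on the title parser, as the question's code already does. Another is to pass the title as a quoted phrase. The sketch below shows the second option using a hypothetical helper, `toPhrase`, which is not part of Lucene; it assumes the usual query syntax in which double quotes delimit a phrase and a backslash escapes special characters.

```java
public class PhraseQuoting {
    // Hypothetical helper (not a Lucene API): wraps raw user input in double
    // quotes so a query parser treats it as one phrase. Backslashes and
    // embedded quotes are escaped first so they cannot end the phrase early.
    static String toPhrase(String raw) {
        String escaped = raw.trim()
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        // The result would be passed to the title parser, e.g. titleQP.parse(...).
        System.out.println(toPhrase("Pizza Hut"));
    }
}
```

A phrase requires the words to appear next to each other in the title, which is stricter than AND; which behaviour is preferable depends on how the titles are expected to be searched.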
