Multiple Field Query handling in Lucene

Question

I have written an index searcher in Lucene that will search multiple fields in the indexed database.

It takes the query as two strings: one is the title and the other is the city name.

The indexed database has three fields: title, address and city.

A hit should occur only if both the title and the city name match. For that purpose, I wrote the following searcher code using MultiFieldQueryParser, with the help of a post:

public void searchdb(String myQuery, String myCity) throws Exception
{
    System.out.println("Searching in the database ...");
    String[] fields={"title","address","city"};
    MultiFieldQueryParser parser = new MultiFieldQueryParser(Version.LUCENE_CURRENT, fields, new StandardAnalyzer(Version.LUCENE_CURRENT));
    parser.setDefaultOperator(QueryParser.Operator.AND);
    if(!myQuery.toLowerCase().contains(myCity.toLowerCase()))
    {
        myQuery="title:"+myQuery+" "+"address:"+myQuery+" "+myCity+" "+"city:"+myCity;
    }
    Query query=parser.parse(myQuery);
    if (query instanceof BooleanQuery) 
    {
        BooleanClause.Occur[] flags ={BooleanClause.Occur.MUST,BooleanClause.Occur.SHOULD,BooleanClause.Occur.MUST};
        BooleanQuery booleanQuery = (BooleanQuery) query;
        BooleanClause[] clauses = booleanQuery.getClauses();
        System.out.println("Query="+booleanQuery.toString()+" and Number of clauses="+clauses.length);
        for (int i = 0; i < clauses.length; i++) 
        {
            clauses[i].setOccur(flags[i]);
        }
        Directory dir=FSDirectory.open(new File("demoIndex"));
        IndexSearcher searcher = new IndexSearcher(dir, true);
        TopDocs hits = searcher.search(booleanQuery, 20);
        searcher.close();
        dir.close();
        System.out.println("Number of hits="+hits.totalHits);
    }
}

But it does not work properly.

For example, if the query is "Pizza Hut" and the city is "Mumbai", I want "Pizza Hut" to be searched only in the title field of the database and "Mumbai" only in the city field.

But it also looks for "Hut" in the city field of the database, because booleanQuery.toString() prints "+title:pizza +(title:hut city:hut) +city:mumbai".

As a result, the for loop throws an index-out-of-bounds error.

I am new to Lucene. So I am asking for help to fix the problem.

Recommended answer

We use MultiFieldQueryParser only when we want to search the same keyword(s) in multiple fields.
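
A rough way to see why the hand-built query string in the question goes wrong: in Lucene's classic query syntax, a field: prefix applies only to the single term (or quoted phrase, or parenthesized group) that directly follows it, and every other term falls back to the parser's default fields. The sketch below is a toy model in plain Java, not Lucene's parser. The class FieldBindingSketch, the method bind, and the "(title|address|city)" label are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this is NOT Lucene's parser. It imitates one rule of the
// classic query syntax: "field:" binds just the single term that follows it.
class FieldBindingSketch {

    // Hypothetical helper: splits the query on whitespace and labels each term
    // with the field it would be searched in. Unprefixed terms get defaultFields.
    static List<String> bind(String query, String defaultFields) {
        List<String> clauses = new ArrayList<>();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input produces no clauses
            }
            if (token.indexOf(':') > 0) {
                clauses.add(token);                       // explicit field, e.g. title:Pizza
            } else {
                clauses.add(defaultFields + ":" + token); // falls back to the default fields
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        // The string the question's code builds for myQuery="Pizza Hut", myCity="Mumbai".
        String built = "title:Pizza Hut address:Pizza Hut Mumbai city:Mumbai";
        List<String> clauses = bind(built, "(title|address|city)");
        for (String clause : clauses) {
            System.out.println(clause);
        }
        System.out.println("clauses = " + clauses.size());
    }
}
```

In this toy model the string yields six clauses, and both "Hut" and the bare "Mumbai" end up on the default fields. A fixed three-element flags array cannot line up with a clause count that depends on how many words the user types, which is consistent with the index error reported in the question.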

Your use case is simpler to handle, because you already have the city keyword and the title keyword as separate strings. Try the following code.

StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_CURRENT);
// city query
QueryParser cityQP = new QueryParser(Version.LUCENE_CURRENT, "city", analyzer);
Query cityQuery = cityQP.parse(myCity);

// title query
QueryParser titleQP = new QueryParser(Version.LUCENE_CURRENT, "title", analyzer);
Query titleQuery = titleQP.parse(myQuery);

// final query
BooleanQuery finalQuery = new BooleanQuery();
finalQuery.add(cityQuery, BooleanClause.Occur.MUST);  // MUST: the city clause has to match.
finalQuery.add(titleQuery, BooleanClause.Occur.MUST); // Making every clause MUST is equivalent to the "AND" operator.
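
The effect of combining two MUST clauses can be illustrated without Lucene at all. The following is a minimal in-memory sketch, assuming a made-up Listing class and made-up sample data. Field matching is reduced to a case-insensitive substring test, which is not how Lucene's analyzer and parser work; only the "both clauses must match" logic is the point here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy in-memory model of an all-MUST query; this is not Lucene code.
class MustSemanticsSketch {

    // Hypothetical stand-in for an indexed document with the three fields.
    static final class Listing {
        final String title;
        final String address;
        final String city;

        Listing(String title, String address, String city) {
            this.title = title;
            this.address = address;
            this.city = city;
        }
    }

    // Simplified matching: case-insensitive substring test (an assumption of this sketch).
    static boolean contains(String fieldValue, String keyword) {
        return fieldValue.toLowerCase(Locale.ROOT).contains(keyword.toLowerCase(Locale.ROOT));
    }

    // A listing is a hit only if the title clause AND the city clause both match.
    static List<Listing> search(List<Listing> index, String titleKeyword, String cityKeyword) {
        List<Listing> hits = new ArrayList<>();
        for (Listing doc : index) {
            boolean titleOk = contains(doc.title, titleKeyword); // plays the role of titleQuery
            boolean cityOk = contains(doc.city, cityKeyword);    // plays the role of cityQuery
            if (titleOk && cityOk) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration.
        List<Listing> index = new ArrayList<>();
        index.add(new Listing("Pizza Hut", "Link Road", "Mumbai"));
        index.add(new Listing("Pizza Hut", "MG Road", "Pune"));
        index.add(new Listing("Hut Cafe", "Pizza Lane", "Mumbai"));

        for (Listing hit : search(index, "Pizza Hut", "Mumbai")) {
            System.out.println(hit.title + ", " + hit.address + ", " + hit.city);
        }
    }
}
```

Only the first listing satisfies both conditions: the second fails on city, and the third fails on title even though "Pizza" appears in its address, because the address field is never consulted.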
