How to make the Lucene QueryParser more forgiving?

Question

I'm using Lucene.net, but I am tagging this question for both .NET and Java versions because the API is the same and I'm hoping there are solutions on both platforms.

I'm sure other people have addressed this issue, but I haven't been able to find any good discussions or examples.

By default, Lucene is very picky about query syntax. For example, I just got the following error:

[ParseException: Cannot parse 'hi there!': Encountered "<EOF>" at line 1, column 9.
Was expecting one of:
    "(" ...
    "*" ...
    <QUOTED> ...
    <TERM> ...
    <PREFIXTERM> ...
    <WILDTERM> ...
    "[" ...
    "{" ...
    <NUMBER> ...
    ]
   Lucene.Net.QueryParsers.QueryParser.Parse(String query) +239

What is the best way to prevent ParseExceptions when processing queries from users? It seems to me that the most usable search interface is one that always executes a query, even if it might be the wrong query.

It seems that there are a few possible, and complementary, strategies:

  • "Clean" the query prior to sending it to the QueryParser
  • Handle exceptions gracefully
    • Show an intelligent error message to the user
    • Perhaps execute a simpler query, leaving off the erroneous bit

I don't really have any great ideas about how to do any of those strategies. Has anyone else addressed this issue? Are there any "simple" or "graceful" parsers that I don't know about?

Recommended answer

You can make Lucene ignore the special characters by sanitizing the query with something like

      query = QueryParser.Escape(query)
      

If you never want your users to use advanced syntax in their queries, you can always do this.
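To make the effect of escaping concrete, here is a minimal Java sketch of what such an escape step does. It is a hand-written stand-in, not Lucene's own code: the class name `EscapeSketch` and the `SPECIAL` character list are assumptions for illustration, and the exact set of special characters differs between Lucene versions. In a real application you would call the library's own `QueryParser.escape` (Java) or `QueryParser.Escape` (Lucene.NET) instead.

```java
final class EscapeSketch {
    // Characters assumed to be query syntax for the classic parser.
    // This list is illustrative; check it against your Lucene version.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Prefixes every special character with a backslash so it is read literally. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The query from the error message above: the trailing '!' gets escaped.
        System.out.println(escape("hi there!")); // prints: hi there\!
    }
}
```

Note that escaping everything also disables operators the user may have typed on purpose, such as quotes or wildcards, which is why this fits best when advanced syntax is not wanted at all.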

If you want your users to be able to use advanced syntax but also want to be more forgiving of mistakes, sanitize the query only after a ParseException has occurred.
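The "parse first, escape only on failure" approach could be sketched roughly as follows. Everything named here is hypothetical scaffolding so the example runs without Lucene on the classpath: `Parser` stands in for the real query parser, the nested `ParseException` stands in for Lucene's exception (which is a checked exception in the Java library, whereas this stand-in is unchecked to keep the sketch short), and the escape step is passed in as a function where you would normally supply the library's own escape method.

```java
import java.util.function.Function;

final class ForgivingParse {
    /** Hypothetical stand-in for the parser's ParseException. */
    static class ParseException extends RuntimeException {
        ParseException(String message) { super(message); }
    }

    /** Hypothetical stand-in for a query parser's parse method. */
    interface Parser<Q> {
        Q parse(String text);
    }

    /** Tries the raw input first; on a parse error, retries with the escaped input. */
    static <Q> Q parseLeniently(Parser<Q> parser,
                                Function<String, String> escaper,
                                String userInput) {
        try {
            return parser.parse(userInput);          // advanced syntax still works
        } catch (ParseException e) {
            return parser.parse(escaper.apply(userInput)); // fall back to literal text
        }
    }

    public static void main(String[] args) {
        // Toy parser for demonstration only: rejects a trailing, unescaped '!'.
        Parser<String> toy = text -> {
            if (text.endsWith("!") && !text.endsWith("\\!")) {
                throw new ParseException("dangling '!'");
            }
            return "Query[" + text + "]";
        };
        Function<String, String> escapeBang = s -> s.replace("!", "\\!");
        System.out.println(parseLeniently(toy, escapeBang, "hi there!"));
    }
}
```

With this shape, well-formed advanced queries run exactly as typed, and only malformed input is downgraded to a plain literal search, so the user always gets some result instead of an error page.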
