Determine if a sentence is an inquiry
Question
How can I detect if a search query is in the form of a question?
For example, a customer might search for "how do I track my order" (note the missing question mark).
I'm guessing most direct questions would conform to a particular grammar.
A very simple guessing approach:
START WORDS = [who, what, when, where, why, how, is, can, does, do]

isQuestion(sentence):
    sentence ends with '?'
    OR sentence starts with one of START WORDS
The START WORDS list could be longer. The scope is a website search box, so I imagine the list shouldn't need to include too many words.
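The guessing approach above can be written as a short runnable sketch. This is only one possible reading of the pseudocode: the function name `is_question` and the punctuation stripping on the first word are assumptions added for illustration, not part of the original post.

```python
# Starter list copied from the pseudocode above; extend as needed.
START_WORDS = {"who", "what", "when", "where", "why", "how",
               "is", "can", "does", "do"}


def is_question(sentence: str) -> bool:
    """Return True if the text ends with '?' or begins with a start word."""
    text = sentence.strip().lower()
    if not text:
        return False
    if text.endswith("?"):
        return True
    # Compare only the first whitespace-delimited token, minus punctuation
    # (the punctuation stripping is an assumption, not in the pseudocode).
    first_word = text.split()[0].strip(".,!?\"'")
    return first_word in START_WORDS


print(is_question("how do I track my order"))  # True
print(is_question("order tracking"))           # False
```

Short start words such as "is", "can", and "do" will also match non-questions (for example "do not disturb sign"), so this heuristic trades precision for simplicity.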
Is there a library that can do this better than my simple guessing approach? Are there any improvements to my approach?
Recommended answer
In a syntactic parse of a question (obtained through a toolkit like NLTK), the correct structure will take the form:
(SBARQ (WH+ (W+) ...)
       (SQ ...*
           (V+) ...*)
       (?))
So, using any one of the available syntactic parsers, a tree with an SBARQ node (optionally with an embedded SQ) is an indicator that the input is a question. The WH+ node (WHNP/WHADVP/WHADJP) contains the question stem (who/what/when/where/why/how), and the SQ holds the inverted phrase.
For example:
(SBARQ
  (WHNP
    (WP What))
  (SQ
    (VBZ is)
    (NP
      (DT the)
      (NN question)))
  (. ?))
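Once a parser has produced a bracketed tree like the one above, checking for the SBARQ and SQ labels is straightforward. The sketch below is a minimal, standard-library-only illustration: `parse_tree`, `labels`, and `question_signals` are hypothetical helper names, and the input is assumed to be a well-formed bracketed string that some parser has already produced. In practice, the parsing toolkit's own tree class would normally replace the hand-rolled reader.

```python
import re


def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # skip ")"
        return (label, children)

    return read_node()


def labels(node):
    """Yield every node label in the tree, depth first."""
    label, children = node
    yield label
    for child in children:
        if isinstance(child, tuple):
            yield from labels(child)


def question_signals(tree_text):
    """Report whether the parse contains SBARQ and SQ nodes."""
    found = set(labels(parse_tree(tree_text)))
    return {"SBARQ": "SBARQ" in found, "SQ": "SQ" in found}


EXAMPLE = ("(SBARQ (WHNP (WP What)) "
           "(SQ (VBZ is) (NP (DT the) (NN question))) (. ?))")
print(question_signals(EXAMPLE))  # {'SBARQ': True, 'SQ': True}
```

Reporting the two labels separately lets the caller decide how strict to be, for instance accepting an SBARQ without an SQ as a weaker signal.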
Of course, having a lot of preceding clauses will cause errors in the parse (which can be worked around), as will really poorly written questions. For example, the title of this post, "How to find out if a sentence is a question?", will have an SBARQ but not an SQ.