Determine if a sentence is an inquiry
Question
How can I detect if a search query is in the form of a question?
For example, a customer might search for "how do I track my order" (notice no question mark).
I'm guessing most direct questions would conform to a particular grammar.
A very simple guessing approach:
START WORDS = [who, what, when, where, why, how, is, can, does, do]

isQuestion(sentence):
    sentence ends with '?'
    OR sentence starts with one of START WORDS
START WORDS list could be longer. The scope is a website search box, so I imagine the list shouldn't need to include too many words.
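The guessing approach above could be written in Python roughly as follows. This is only a sketch: the name `is_question` and the choice to compare just the first alphabetic word, case-insensitively, are assumptions made for illustration, and the word list is simply the one from the pseudocode.

```python
import re

# Starter list copied from the pseudocode; extend it to suit the site.
START_WORDS = {"who", "what", "when", "where", "why", "how",
               "is", "can", "does", "do"}


def is_question(sentence: str) -> bool:
    """Guess whether a search query is phrased as a question."""
    text = sentence.strip()
    if not text:
        return False
    # Rule 1: an explicit question mark at the end.
    if text.endswith("?"):
        return True
    # Rule 2: the first run of letters is a known question starter.
    # Matching letters only means "what's ..." is read as "what".
    first_word = re.search(r"[A-Za-z]+", text)
    return bool(first_word) and first_word.group(0).lower() in START_WORDS


print(is_question("how do I track my order"))  # True
print(is_question("track my order"))           # False
```

Note that starters such as "is", "can" and "do" may also begin queries that are not questions, so the list is a trade-off between coverage and false positives.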
Is there a library that can do this better than my simple guessing approach? Any improvements on my approach?
Recommended answer
In a syntactic parse of a question (obtained through a toolkit like nltk), the correct structure will be in the form of:
(SBARQ (WH+ (W+) ...)
       (SQ ...*
           (V+) ...*)
       (?))
So, using any one of the available syntactic parsers, a tree with an SBARQ node, optionally containing an embedded SQ, indicates that the input is a question. The WH+ node (WHNP/WHADVP/WHADJP) contains the question stem (who/what/when/where/why/how), and the SQ holds the inverted phrase.
For example:
(SBARQ
  (WHNP
    (WP What))
  (SQ
    (VBZ is)
    (NP
      (DT the)
      (NN question)))
  (. ?))
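As a rough sketch of that check, the snippet below reads a bracketed parse like the one above and reports whether it contains an SBARQ node and whether an SQ is embedded inside it. It assumes you already have the parser's output as a bracketed string; the helper names (`parse_bracketed`, `subtrees`, `question_evidence`) are made up for this example, and the reader is a small standard-library stand-in. If nltk is installed, `nltk.Tree.fromstring` can read the same bracketed format instead.

```python
import re


def parse_bracketed(text):
    """Read a bracketed parse into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                       # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])   # leaf word
                pos += 1
        pos += 1                       # consume ")"
        return (label, children)

    return read_node()


def subtrees(node):
    """Yield the node itself and every constituent below it."""
    yield node
    for child in node[1]:
        if isinstance(child, tuple):
            yield from subtrees(child)


def question_evidence(parse_text):
    """Return (has_sbarq, has_sq_inside_sbarq) for a bracketed parse."""
    tree = parse_bracketed(parse_text)
    sbarq_nodes = [n for n in subtrees(tree) if n[0] == "SBARQ"]
    has_sq = any(sub[0] == "SQ" for n in sbarq_nodes for sub in subtrees(n))
    return bool(sbarq_nodes), has_sq


example = """
(SBARQ
  (WHNP (WP What))
  (SQ (VBZ is) (NP (DT the) (NN question)))
  (. ?))
"""
print(question_evidence(example))  # (True, True)
```

Returning the two flags separately lets the caller decide how strict to be, for instance accepting an SBARQ without an SQ as weaker evidence of a question.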
Of course, a lot of preceding clauses will cause errors in the parse (which can be worked around), as will really poorly written questions. For example, the title of this post, "How to find out if a sentence is a question?", will have an SBARQ but not an SQ.