Algorithms for Natural Language Understanding


Problem description

I would like to know which algorithms I could use for natural language understanding (NLU).

For example, let's say I want to start a program, and I have these sentences:

"Let us start"

"Let him start"

Obviously, the first sentence should start the program, but the second one should not, since it has nothing to do with starting the program.

Right now, I am using Stanford's NLP API and have implemented the TokensRegexAnnotator class:

CoreMapExpressionExtractor<MatchedExpression> extractor = CoreMapExpressionExtractor.createExtractorFromFile(env, "tr.txt");

So my code "knows" what "start" should do: "start" should trigger the program. But "start" can be used with anything, as in "Start the car." In that case I would not want "start" to launch the program, because the sentence is about starting a car, not the program. To solve this, I used Stanford's CollapsedDependenciesAnnotation class:

SemanticGraph dependencies = s.get(CollapsedDependenciesAnnotation.class);
Iterable<SemanticGraphEdge> edge_set = dependencies.edgeIterable();

I used the nsubj dependency to check whether the subject is a PRP (pronoun), since I want the program to start only when the subject is a PRP. So when I input the sentence "Let us start", the program started. When I input the sentence "Start the car", the program did not start. So far everything works well...

BUT the program also starts when I input the sentence "Let him start" (as mentioned above). It starts because "him" is also a pronoun. I do not want the program to start on this sentence, because "Let him start" has nothing to do with starting the program. So how can the program know this? Are there algorithms that let the computer differentiate between "let us start" and "let him start"?

Any ideas on how to solve this problem?

Thanks!

(I hope I was clear.)

Recommended answer

One way Stanford CoreNLP could help you is its TokensRegex functionality. With this tool you can write explicit patterns and tag their matches in your input text. Your code can then react based on which patterns are present.

Here are some links with more info:

http://nlp.stanford.edu/software/tokensregex.shtml

http://nlp.stanford.edu/software/regexner/

I would recommend identifying the common expressions you want to handle, the ones that deserve a clear response, and building up from there until you get decent coverage of what users actually input.

For example:

Let us (start|begin).
(Start|begin) the (program|software)
I'm ready to (start|begin)
etc...

Obviously you could combine these rules and make them increasingly complicated. But I think a straightforward approach would be to think of the various ways someone might express that they want to begin, and then capture those with rules.
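As a rough illustration of the rule-based idea, here is a minimal sketch that uses plain `java.util.regex` rather than TokensRegex, so it runs without CoreNLP. The class name `StartIntentMatcher`, the method names, and the normalization step are assumptions made for this example and are not part of any Stanford API. Because the rules list the accepted phrasings explicitly, "Let us start" matches while "Let him start" and "Start the car" do not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a CoreNLP class.
public class StartIntentMatcher {

    // Whole-sentence rules mirroring the example patterns above.
    // These are ordinary Java regexes, not TokensRegex syntax.
    private static final List<Pattern> START_RULES = Arrays.asList(
        Pattern.compile("let us (start|begin)"),
        Pattern.compile("(start|begin) the (program|software)"),
        Pattern.compile("i'm ready to (start|begin)")
    );

    // Lowercase, trim, drop trailing sentence punctuation, collapse whitespace.
    public static String normalize(String input) {
        return input.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[.!?]+$", "")
                .replaceAll("\\s+", " ");
    }

    // True only if the entire normalized sentence matches one of the rules.
    public static boolean isStartCommand(String input) {
        String text = normalize(input);
        for (Pattern rule : START_RULES) {
            if (rule.matcher(text).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {
            "Let us start", "Let him start", "Start the car.", "Start the program"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isStartCommand(sample));
        }
    }
}
```

Matching the whole sentence (rather than searching for "start" anywhere in it) is what keeps unrelated sentences from triggering the program. The same rules could presumably be expressed as TokensRegex patterns over tokens, which would also let them use part-of-speech or lemma annotations.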
