Finding meaningful sub-sentences from a sentence

Question

Is there a way to find all the sub-sentences of a sentence that are still meaningful and contain at least one subject, verb, and predicate/object?

For example, given a sentence like "I am going to do a seminar on NLP at SXSW in Austin next month", we can extract the following meaningful sub-sentences: "I am going to do a seminar", "I am going to do a seminar on NLP", "I am going to do a seminar on NLP at SXSW", "I am going to do a seminar at SXSW", "I am going to do a seminar in Austin", "I am going to do a seminar on NLP next month", etc.
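To make the example concrete, here is a minimal sketch of how such candidates could be enumerated once the sentence has already been split into a core clause and a list of optional modifiers. The split itself is hand-written here and is an assumption; in practice it would have to come from a syntactic parser, and the function name `enumerate_sub_sentences` is hypothetical. The sketch also assumes every modifier can be dropped independently, which will not hold for every sentence.

```python
from itertools import combinations


def enumerate_sub_sentences(core, modifiers):
    """Return the core clause combined with every subset of the modifiers.

    combinations() keeps the modifiers in their original order, so each
    result reads as a word-order-preserving part of the full sentence.
    """
    results = []
    for size in range(len(modifiers) + 1):
        for subset in combinations(modifiers, size):
            results.append(" ".join([core, *subset]))
    return results


# Hand-made segmentation of the example sentence (an assumption, not parser output).
core = "I am going to do a seminar"
modifiers = ["on NLP", "at SXSW", "in Austin", "next month"]

candidates = enumerate_sub_sentences(core, modifiers)
for sentence in candidates:
    print(sentence)
```

With four optional modifiers this produces 2^4 = 16 candidates, including the bare core clause and the full original sentence. Whether each candidate is actually meaningful still has to be judged separately.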

Please note that there are no deduced sentences here (e.g. "There will be an NLP seminar at SXSW next month"; although this is true, we don't need it as part of this problem). All generated sentences are strictly part of the given sentence.
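One possible reading of "strictly part of the given sentence" is that the words of a candidate must appear in the original sentence in the same order, i.e. the candidate is a token subsequence of it. The sketch below checks that condition under that assumption; the function name `is_sub_sentence` is hypothetical, and the check uses naive whitespace tokenization and is case-sensitive.

```python
def is_sub_sentence(candidate, sentence):
    """Return True if the candidate's tokens occur in the sentence in order."""
    remaining = iter(sentence.split())
    # "token in remaining" advances the iterator until it finds the token,
    # so each later token must be found after the previous match.
    return all(token in remaining for token in candidate.split())


original = "I am going to do a seminar on NLP at SXSW in Austin next month"

print(is_sub_sentence("I am going to do a seminar in Austin", original))
print(is_sub_sentence("There will be a NLP seminar at SXSW next month", original))
```

The first call accepts the extracted sub-sentence, while the second rejects the deduced sentence because it contains words such as "There" that do not occur in the original. This is only a necessary condition: a subsequence can still be ungrammatical or meaningless.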

How can we approach solving this problem? I was thinking of creating annotated training data that has a set of legal sub-sentences for each sentence in the training data set, and then writing some supervised learning algorithm(s) to generate a model.

I am quite new to NLP and Machine Learning, so it would be great if you guys could suggest some ways to solve this problem.

Recommended answer

There's a paper titled "Using Discourse Commitments to Recognize Textual Entailment" by Hickl et al. that discusses the extraction of discourse commitments (sub-sentences). The paper includes a description of their algorithm, which is to some extent rule-based. They used it for RTE (recognizing textual entailment), and there may be some minimal level of deduction in the output. Text simplification may be a related area to look at.
