Finding meaningful sub-sentences from a sentence


Question

Is there a way to find all the sub-sentences of a sentence that are still meaningful and contain at least a subject, a verb, and a predicate/object?

For example, given the sentence "I am going to do a seminar on NLP at SXSW in Austin next month", we can extract the following meaningful sub-sentences: "I am going to do a seminar", "I am going to do a seminar on NLP", "I am going to do a seminar on NLP at SXSW", "I am going to do a seminar at SXSW", "I am going to do a seminar in Austin", "I am going to do a seminar on NLP next month", etc.

Please note that there are no deduced sentences here. For example, "There will be an NLP seminar at SXSW next month" is true, but we don't need it as part of this problem. All generated sentences are strictly parts of the given sentence.

How can we approach this problem? I was thinking of creating annotated training data that contains a set of legal sub-sentences for each sentence in the training set, and then writing one or more supervised learning algorithms to generate a model.

I am quite new to NLP and machine learning, so it would be great if you could suggest some ways to solve this problem.

Recommended answer

There's a paper titled "Using Discourse Commitments to Recognize Textual Entailment" by Hickl et al. that discusses the extraction of discourse commitments (sub-sentences). The paper includes a description of their algorithm, which at some level operates on rules. They used it for recognizing textual entailment (RTE), so the output may involve a minimal amount of deduction. Text simplification may be a related area to look at.
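As a rough illustration of the rule-based direction (this is not the algorithm from the Hickl et al. paper), the sub-sentences listed in the question can be reproduced by keeping a core clause and attaching every order-preserving subset of its optional modifiers. The sketch below assumes the sentence has already been split by hand into a core clause and modifier phrases; in practice that split would have to come from a parser or a set of rules. The function name `sub_sentences` and the segmentation itself are hypothetical.

```python
from itertools import combinations


def sub_sentences(core, modifiers):
    """Yield the core clause followed by every subset of the modifiers.

    combinations() keeps the input order, so modifiers always appear in
    the same relative order as in the original sentence.
    """
    for k in range(len(modifiers) + 1):
        for subset in combinations(modifiers, k):
            yield " ".join([core, *subset])


# Assumption: a manual segmentation of the example sentence from the question.
core = "I am going to do a seminar"
modifiers = ["on NLP", "at SXSW", "in Austin", "next month"]

results = list(sub_sentences(core, modifiers))
for sentence in results:
    print(sentence)
```

With four optional modifiers this yields 16 candidates, including the bare core clause and the full original sentence. It treats every modifier as independently droppable, which matches the examples in the question but will not hold for every sentence, so a real system would still need rules or a model to decide which combinations remain meaningful.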
