Shorten a text and only keep important sentences

Question

The German website nandoo.net offers the possibility to shorten a news article. If you change the percentage value with a slider, the text changes and some sentences are left out.

You can see that in action here:

http://www.nandoo.net/read/article/299925/

The news article is on the left side and tags are marked. The slider is on the top of the second column. The more you move the slider to the left, the shorter the text becomes.

How can you offer something like that? Are there any algorithms which you can use to achieve that?

My idea was that their algorithm counts the number of tags and nouns in each sentence. Then the sentences with the fewest tags/nouns are left out.
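To make the idea concrete, here is a minimal Python sketch of it. It is only an illustration, not how nandoo.net actually works: the function names, the `keep` parameter (standing in for the slider percentage) and the naive sentence splitter are all assumptions, and it counts only tags, since counting nouns would additionally need a part-of-speech tagger.

```python
import re

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def count_tags(sentence, tags):
    """Count how many words of the sentence are article tags (case-insensitive)."""
    tag_set = {t.lower() for t in tags}
    words = re.findall(r"\w+", sentence.lower())
    return sum(1 for w in words if w in tag_set)

def shorten_by_tags(text, tags, keep=0.5):
    """Keep roughly the fraction `keep` of sentences with the most tag hits.

    `keep` is a hypothetical stand-in for the slider value (0 < keep <= 1).
    """
    sentences = split_sentences(text)
    n_keep = max(1, int(len(sentences) * keep))
    # Rank sentence indices by tag count, highest first (ties keep text order).
    ranked = sorted(range(len(sentences)),
                    key=lambda i: count_tags(sentences[i], tags),
                    reverse=True)
    # Re-sort the survivors so the shortened text keeps its original order.
    return " ".join(sentences[i] for i in sorted(ranked[:n_keep]))
```

Calling `shorten_by_tags(article_text, ["Berlin", "festival"], keep=0.67)` would drop roughly the third of the sentences that mention the tags least often.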

Could that be true? Or do you have another idea?

I hope you can help me. Thanks in advance!

Recommended answer

Usually you want to keep the sentences that have words that are more unique to that article.

That is, the more "generic" the sentence is, the less it describes this particular article.

The normal way to do this is Bayesian analysis, much like a spam filter. First determine which words in the article appear more often than you'd expect from ordinary text, then find the sentences that feature those words.
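A rough Python sketch of that approach follows. It is a simplified stand-in, not a full Bayesian model: each word gets a smoothed log-ratio of its frequency in the article against its frequency in a background text, and a sentence's score is the average of the positive word scores. The `background` text (any sample of ordinary prose in the same language), the `keep` fraction and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; deliberately naive (handles German umlauts)."""
    return re.findall(r"[a-zäöüß']+", text.lower())

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def word_scores(article, background):
    """Log-ratio of add-one-smoothed word frequency: article vs. background."""
    art = Counter(tokenize(article))
    bg = Counter(tokenize(background))
    vocab_size = len(set(art) | set(bg))
    art_total = sum(art.values()) + vocab_size
    bg_total = sum(bg.values()) + vocab_size
    return {
        w: math.log(((art[w] + 1) / art_total) / ((bg[w] + 1) / bg_total))
        for w in art
    }

def shorten(article, background, keep=0.5):
    """Keep roughly the fraction `keep` of sentences with the most distinctive words."""
    sentences = split_sentences(article)
    scores = word_scores(article, background)

    def sentence_score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Only words that are over-represented in the article contribute.
        return sum(max(scores.get(w, 0.0), 0.0) for w in words) / len(words)

    n_keep = max(1, int(len(sentences) * keep))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_score(sentences[i]),
                    reverse=True)
    # Emit the surviving sentences in their original order.
    return " ".join(sentences[i] for i in sorted(ranked[:n_keep]))
```

Averaging over sentence length keeps long sentences from winning just because they contain more words. In practice the background would be a large corpus of unrelated articles rather than a single string.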
