Extract sentences relevant to an entity

Question

Do you know of any paper or algorithm in NLP that can extract the sentences in a text that relate to a given entity (term)? I would like to process some reviews (mainly tech), but I found that many reviews mention more than one product (they make comparisons). I would like to extract only the sentences that are relevant to one product, or delete the sentences that are irrelevant to a particular named entity (product).

My question is how to do this. Are there any related papers? Does some toolkit or API already do something like this?

Recommended answer

What you want is a Named Entity Recognizer (NER). Given an input sentence, an NER identifies the various entities in the sentence as persons, organizations, products, etc. You can then check which entities were recognized as products, and keep or discard the sentence accordingly. One very simple possibility is to use the named entity recognizer of NLTK in Python. Here is an example:

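import nltk

# Requires the NLTK tokenizer, tagger and NE chunker data (install with nltk.download).
sent = "Albert Einstein spent many years at Princeton University in New Jersey"
tokens = nltk.word_tokenize(sent)  # split the sentence into word tokens
tagged = nltk.pos_tag(tokens)      # attach a part-of-speech tag to each token
tree = nltk.ne_chunk(tagged)       # group tagged tokens into named-entity chunks
print(tree)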

The output will be:

(S
  (PERSON Albert/NNP)
  (PERSON Einstein/NNP)
  spent/VBD
  many/JJ 
  years/NNS
  at/IN
  (ORGANIZATION Princeton/NNP University/NNP)
  in/IN
  (GPE New/NNP Jersey/NNP))

NLTK works well for this simple example, but to be honest I'm not sure how accurate it is or whether it can be customized to fit your purpose (identifying products). I do know that the Stanford NER is both customizable and accurate, so it may be worth a look.
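
To illustrate the "keep or discard the sentence" step, here is a minimal sketch that does not use an NER at all. It assumes the product names are already known and relies on plain string matching plus a naive regex sentence splitter. The function names (split_sentences, mentions, sentences_about) and the product names AlphaPhone and BetaPhone are made up for this example. In a real pipeline the name lists could instead come from the entities an NER labels as products.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def mentions(sentence, names):
    """True if any name occurs in the sentence as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(name) + r"\b", sentence, re.IGNORECASE)
        for name in names
    )


def sentences_about(text, target_names, other_names=()):
    """Keep sentences that mention the target product and none of the others."""
    kept = []
    for sentence in split_sentences(text):
        if mentions(sentence, target_names) and not mentions(sentence, other_names):
            kept.append(sentence)
    return kept


# Hypothetical review text comparing two invented products.
review = (
    "The AlphaPhone has a great camera. "
    "The BetaPhone feels heavier in the hand. "
    "Battery life on the AlphaPhone is average. "
    "Overall, both phones are fast."
)

result = sentences_about(review, ["AlphaPhone"], other_names=["BetaPhone"])
for sentence in result:
    print(sentence)
```

Sentences that name both products are dropped here, which is only one possible policy; leaving other_names empty keeps them. Sentences that refer to a product only by a pronoun ("it", "both phones") are missed entirely, which is where coreference resolution or a trained model would be needed.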
