Replace entity with its label in SpaCy

Question

Is there any way in SpaCy to replace an entity detected by SpaCy NER with its label? For example: I am eating an apple while playing with my Apple Macbook.

I have trained an NER model with SpaCy to detect the "FRUITS" entity, and the model successfully detects the first "apple" as "FRUITS", but not the second "Apple".

I want to post-process my data by replacing each entity with its label, so I want to replace the first "apple" with "FRUITS". The sentence would become "I am eating an FRUITS while playing with my Apple Macbook."

If I simply use a regex, it will replace the second "Apple" with "FRUITS" as well, which is incorrect. Is there a smart way to do this?

Thanks!

Recommended answer

The entity label is an attribute of each token, `ent_type_`, which is an empty string for tokens that are not part of an entity:

import spacy
# Load a pretrained English pipeline that includes an NER component
nlp = spacy.load('en_core_web_lg')

s = "His friend Nicolas is here."
doc = nlp(s)

print([t.text if not t.ent_type_ else t.ent_type_ for t in doc])
# ['His', 'friend', 'PERSON', 'is', 'here', '.']

print(" ".join([t.text if not t.ent_type_ else t.ent_type_ for t in doc]) )
# His friend PERSON is here .

To handle cases where entities span several words, the following code can be used instead:

s = "His friend Nicolas J. Smith is here with Bart Simpon and Fred."
doc = nlp(s)
newString = s
# Iterate in reverse so that each substitution leaves the offsets of earlier entities unchanged
for e in reversed(doc.ents):
    newString = newString[:e.start_char] + e.label_ + newString[e.end_char:]
print(newString)
#His friend PERSON is here with PERSON and PERSON.
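
The same right-to-left substitution can be wrapped in a small reusable helper. The sketch below is only an illustration, not part of the original answer: the function name `replace_spans_with_labels` and the `(start_char, end_char, label)` tuple format are assumptions made here, and the span is built by hand so that the example runs without a trained model. With a real pipeline, the spans would presumably be collected as `[(e.start_char, e.end_char, e.label_) for e in doc.ents]`.

```python
from typing import Iterable, Tuple

# Hypothetical span format for this sketch: (start_char, end_char, label)
Span = Tuple[int, int, str]


def replace_spans_with_labels(text: str, spans: Iterable[Span]) -> str:
    """Replace each character span in `text` with its label."""
    result = text
    # Process spans from right to left so that the offsets of the
    # spans still to be replaced are not shifted by earlier substitutions.
    for start, end, label in sorted(spans, key=lambda s: s[0], reverse=True):
        result = result[:start] + label + result[end:]
    return result


text = "I am eating an apple while playing with my Apple Macbook."
# Stand-in for the NER output: only the lowercase "apple" is marked as FRUITS.
start = text.index("apple")
spans = [(start, start + len("apple"), "FRUITS")]
print(replace_spans_with_labels(text, spans))
# I am eating an FRUITS while playing with my Apple Macbook.
```

Because the replacement is driven by character offsets rather than by matching the word itself, the second "Apple" is left untouched, which is what the regex approach in the question could not guarantee. The helper assumes the spans do not overlap, which holds for `doc.ents`.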
