How to represent gazetteers or dictionaries as features in CRF++?

Question

How can I use gazetteers or dictionaries as features in CRF++?

To elaborate: suppose I want to do NER on person names, and I have a gazetteer (or dictionary) of commonly seen person names. I want to use this gazetteer as an input to CRF++. How can I do that?

I am using the conditional random field package CRF++ to perform named entity recognition tasks. I know how to represent some commonly used features in CRF++. For example, to use capitalization as a feature, we can add a separate column to the training data indicating whether each word is capitalized, and refer to that column from the feature template.

Recommended answer

You could make a new feature that indicates whether a token is in the dictionary/gazetteer. Just check for set membership and set the gazetteer feature to 1 or 0.
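
A minimal sketch of that idea in Python, assuming the usual CRF++ data layout (one token per line, whitespace-separated columns, the label in the last column, a blank line between sentences). The function name `add_gazetteer_column`, the sample names, and the BIO-style labels are hypothetical and only for illustration. Matching here is per token and case-insensitive, so multi-word gazetteer entries would need extra handling.

```python
def add_gazetteer_column(sentences, gazetteer):
    """Build CRF++-style text with columns: token, gazetteer flag, label.

    sentences: list of sentences, each a list of (token, label) pairs.
    gazetteer: iterable of single-token names (an assumption of this sketch).
    """
    # Lowercase once so membership checks are case-insensitive.
    names = {name.lower() for name in gazetteer}
    lines = []
    for sentence in sentences:
        for token, label in sentence:
            flag = "1" if token.lower() in names else "0"
            lines.append(f"{token}\t{flag}\t{label}")
        lines.append("")  # blank line separates sentences
    return "\n".join(lines) + "\n"


# Hypothetical gazetteer and training sentence.
person_names = {"John", "Mary"}
sentences = [[("John", "B-PER"), ("lives", "O"), ("in", "O"), ("Paris", "O")]]

crf_input = add_gazetteer_column(sentences, person_names)
print(crf_input)
```

With this layout the gazetteer flag sits in column 1, so a unigram template line such as U10:%x[0,1] would expose it for the current token, and %x[-1,1] or %x[1,1] for its neighbours. The identifier U10 is arbitrary.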
