How to automatically detect acronym meaning / expansion


Question

How can you detect the meaning (the expansion) of an acronym using NLP / Information Extraction (IE) methods?

We want to detect in free text whether a word or its acronym is used, and map both to the same entity / token.

Most papers available online are about medical acronyms, and they do not provide a library to accomplish this task.

Any ideas?

Recommended answer

Reading your question and the comments, I understand that you want to create a mapping from an acronym to its expansion.

Assuming you have a collection of text documents in which both the acronym and its expansion occur, you can apply an algorithm to extract (acronym, expansion) pairs.

"A Simple Algorithm for Identifying Abbreviation Definitions in Biomedical Text" by A. S. Schwartz and M. A. Hearst does exactly this by looking for patterns such as a long form followed by its short form in parentheses. The Java implementation is available here.
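To make the idea concrete, here is a minimal Python sketch in the spirit of the Schwartz and Hearst approach. It is not the authors' Java implementation: it only handles the "long form (short form)" pattern, it does not stop at sentence boundaries, and it skips the paper's final sanity checks. The function names (`is_valid_short_form`, `find_best_long_form`, `extract_pairs`) are my own, chosen for illustration.

```python
import re

# Text inside a pair of parentheses is a candidate short form.
PAREN = re.compile(r"\(([^()]+)\)")


def is_valid_short_form(short):
    """Heuristic filter: 2-10 chars, at most two words, starts alphanumeric, has a letter."""
    return (
        2 <= len(short) <= 10
        and len(short.split()) <= 2
        and short[0].isalnum()
        and any(c.isalpha() for c in short)
    )


def find_best_long_form(short, candidate):
    """Match the short form's characters right to left inside the candidate text.

    The first character of the short form must match the first character of
    a word in the candidate. Returns the matched long form or None.
    """
    s_idx = len(short) - 1
    l_idx = len(candidate) - 1
    while s_idx >= 0:
        ch = short[s_idx].lower()
        if not ch.isalnum():
            s_idx -= 1
            continue
        # Move left until the character matches (and, for the first
        # short-form character, until the match sits at a word start).
        while (l_idx >= 0 and candidate[l_idx].lower() != ch) or (
            s_idx == 0 and l_idx > 0 and candidate[l_idx - 1].isalnum()
        ):
            l_idx -= 1
        if l_idx < 0:
            return None
        l_idx -= 1
        s_idx -= 1
    # Extend the match to the beginning of the word it starts in.
    start = candidate.rfind(" ", 0, l_idx + 1) + 1
    return candidate[start:]


def extract_pairs(text):
    """Return (acronym, expansion) pairs found as 'expansion (acronym)' in text."""
    pairs = []
    for match in PAREN.finditer(text):
        short = match.group(1).strip()
        if not is_valid_short_form(short):
            continue
        words = text[: match.start()].split()
        # Search window: at most min(|short| + 5, 2 * |short|) preceding words.
        max_words = min(len(short) + 5, len(short) * 2)
        candidate = " ".join(words[-max_words:])
        long_form = find_best_long_form(short, candidate)
        if long_form:
            pairs.append((short, long_form))
    return pairs
```

For example, calling `extract_pairs` on "We use Natural Language Processing (NLP) and Information Extraction (IE) methods." should yield the pairs for NLP and IE. For real use, the reference implementation mentioned above is the safer choice, since this sketch leaves out parts of the original algorithm.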

I applied this algorithm to the English Wikipedia; you can see the results here. I also applied it to a collection of Portuguese news articles; the results are here.
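Coming back to the goal in the question, mapping a word and its acronym to the same token, extracted (acronym, expansion) pairs could then serve as a lookup table. The sketch below is one possible way to do that and rests on assumptions of mine: the hypothetical `normalize` function takes the pairs as a plain list of tuples, uses the acronym itself as the canonical token, and matches expansions case-insensitively.

```python
import re


def normalize(text, pairs):
    """Rewrite every expansion in `text` to its acronym so both share one token.

    `pairs` is a list of (acronym, expansion) tuples, for example the output
    of an extraction step run over a larger corpus.
    """
    # Handle longer expansions first so a long phrase wins over a shorter
    # phrase contained in it.
    for short, long_form in sorted(pairs, key=lambda p: len(p[1]), reverse=True):
        pattern = re.compile(
            r"\b" + re.escape(long_form) + r"\b"
            # Also swallow a trailing "(ACRONYM)" definition, if present.
            + r"(?:\s*\(" + re.escape(short) + r"\))?",
            re.IGNORECASE,
        )
        # A function replacement avoids backslash handling in `short`.
        text = pattern.sub(lambda m, s=short: s, text)
    return text
```

With the pair `("NLP", "Natural Language Processing")`, both "Natural Language Processing (NLP)" and a later lowercase "natural language processing" would be rewritten to "NLP". Ambiguous acronyms (one short form with several expansions) are not handled here and would need context-based disambiguation.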
