Algorithm to detect time, date and place from invitation text
Question
I am researching Natural Language Processing algorithms that can read a piece of text and, if the text appears to be suggesting a meeting, set up that meeting for you automatically.
For example, if an email text reads:
"Let's meet tomorrow someplace in Downtown at 7pm."
The algorithm should be able to detect the time, date and place of the event.
Does anyone know of existing NLP algorithms that I could use for this purpose? I have been looking into some NLP resources (such as NLTK and some tools in R), but have not had much success.
Thanks.
Recommended answer
This is an application of information extraction, and it can be solved more specifically with sequence segmentation algorithms such as hidden Markov models (HMMs) or conditional random fields (CRFs).
For a software implementation, you might want to start with the MALLET toolkit from UMass Amherst; it is a popular library that implements CRFs for information extraction.
You would treat each token in a sentence as something to be labeled with one of the fields you are interested in (or 'x' for none of the above), as a function of word features (part of speech, capitalization, dictionary membership, etc.), something like this:
token       label      features
----------------------------------------------
Let         x          POS=NNP, capitalized
's          x          POS=POS
meet        x          POS=VBP
tomorrow    DATE       POS=NN, inDateDictionary
someplace   x          POS=NN
in          x          POS=IN
Downtown    LOCATION   POS=NN, capitalized
at          x          POS=IN
7pm         TIME       POS=CD, matchesTimeRegex
.           x          POS=.
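As a rough illustration of how the non-POS features in the table might be computed, here is a minimal Python sketch using only the standard library. The names `DATE_WORDS`, `TIME_RE`, `tokenize`, `token_features` and `featurize` are made up for this example, the date word list and time pattern are deliberately tiny, and part-of-speech tags are left out because they would normally come from a separate tagger.

```python
import re

# Hypothetical, very small date lexicon; a real system would need a far larger one.
DATE_WORDS = {"today", "tomorrow", "tonight", "monday", "tuesday", "wednesday",
              "thursday", "friday", "saturday", "sunday"}

# Hypothetical, deliberately narrow pattern: matches tokens such as "7pm" or "11AM".
TIME_RE = re.compile(r"^\d{1,2}(am|pm)$", re.IGNORECASE)


def tokenize(text):
    """Split text into word tokens, clitics such as 's, and punctuation marks."""
    return re.findall(r"\w+|'\w+|[^\w\s]", text)


def token_features(token):
    """Return the list of feature names that fire for a single token."""
    feats = []
    if token[0].isupper():
        feats.append("capitalized")
    if token.lower() in DATE_WORDS:
        feats.append("inDateDictionary")
    if TIME_RE.match(token):
        feats.append("matchesTimeRegex")
    return feats


def featurize(sentence):
    """Pair every token of the sentence with its feature list."""
    return [(tok, token_features(tok)) for tok in tokenize(sentence)]


if __name__ == "__main__":
    for tok, feats in featurize("Let's meet tomorrow someplace in Downtown at 7pm."):
        print(f"{tok:<12}{', '.join(feats)}")
```

These features only describe each token; the label column (DATE, LOCATION, TIME or x) is what the sequence model would have to predict.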
You will need to provide some hand-labeled training data first, though.
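To give an idea of what such hand-labeled data could look like on disk, the sketch below writes one labeled sentence in a whitespace-separated layout: one token per line, features first, label last, and a blank line after each sentence. To my knowledge this is the layout MALLET's SimpleTagger expects, but please check the MALLET documentation before relying on it. The function name `to_tagger_lines` and the `word=` feature prefix are assumptions made for this example.

```python
# One hand-labeled sentence as (token, label, features) triples, taken from the table above.
EXAMPLE = [
    ("Let", "x", ["POS=NNP", "capitalized"]),
    ("'s", "x", ["POS=POS"]),
    ("meet", "x", ["POS=VBP"]),
    ("tomorrow", "DATE", ["POS=NN", "inDateDictionary"]),
    ("someplace", "x", ["POS=NN"]),
    ("in", "x", ["POS=IN"]),
    ("Downtown", "LOCATION", ["POS=NN", "capitalized"]),
    ("at", "x", ["POS=IN"]),
    ("7pm", "TIME", ["POS=CD", "matchesTimeRegex"]),
    (".", "x", ["POS=."]),
]


def to_tagger_lines(sentences):
    """Serialize labeled sentences: 'feature ... feature label' per token,
    with a blank line closing each sentence. Feature names must not contain spaces."""
    lines = []
    for sentence in sentences:
        for token, label, features in sentence:
            # The lowercased word itself is added as a feature (hypothetical 'word=' prefix).
            lines.append(" ".join([f"word={token.lower()}"] + features + [label]))
        lines.append("")  # blank line separates sentences
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_tagger_lines([EXAMPLE]))
```

A realistic training set would contain many such sentences, covering the different ways people phrase dates, times and places in invitations.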