Stanford CRFClassifier performance evaluation output
Question
I'm following the FAQ at https://nlp.stanford.edu/software/crf-faq.shtml to train my own classifier, and I noticed that the performance evaluation output does not match the results (or at least not in the way I expect). Specifically, this section:
CRFClassifier tagged 16119 words in 1 documents at 13824.19 words per second.
Entity P R F1 TP FP FN
MYLABEL 1.0000 0.9961 0.9980 255 0 1
Totals 1.0000 0.9961 0.9980 255 0 1
I expect TP to be all instances where the predicted label matched the gold label, FP to be all instances where MYLABEL was predicted but the gold label was O, and FN to be all instances where O was predicted but the gold label was MYLABEL.
If I calculate those numbers myself from the program's output, I get completely different numbers that bear no relation to what the program prints. I've tried this with various test files.
I'm using Stanford NER v3.7.0 (2016-10-31).
Am I missing something?
Recommended answer
The F1 scores are computed over entities, not token labels, so TP, FP, and FN count whole entities rather than individual tokens.
Example:
(Joe, PERSON) (Smith, PERSON) (went, O) (to, O) (Hawaii, LOCATION) (., O).
In this example there are two possible entities:
Joe Smith PERSON
Hawaii LOCATION
Entities are created by grouping all adjacent tokens that have the same label (unless you use a more complicated BIO labeling scheme; BIO schemes have tags like I-PERSON and B-PERSON to indicate whether a token is the beginning of an entity, and so on).
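To make the difference between token-level and entity-level counting concrete, here is a minimal Python sketch. It is not Stanford NER's code: the function names `extract_entities` and `entity_scores` are hypothetical, and it assumes exact-match scoring (an entity counts as a true positive only if its start, end, and label all agree with the gold entity) and a plain non-BIO label scheme with O as the background label.

```python
from itertools import groupby


def extract_entities(labels, outside="O"):
    """Group adjacent tokens sharing a non-O label into (start, end, label) spans."""
    entities = set()
    index = 0
    for label, run in groupby(labels):
        length = len(list(run))
        if label != outside:
            entities.add((index, index + length, label))
        index += length
    return entities


def entity_scores(gold_labels, predicted_labels):
    """Count TP/FP/FN over whole entities (assumed exact match) and derive P/R/F1."""
    gold = extract_entities(gold_labels)
    predicted = extract_entities(predicted_labels)
    tp = len(gold & predicted)   # predicted entities that exactly match a gold entity
    fp = len(predicted - gold)   # predicted entities with no exact gold match
    fn = len(gold - predicted)   # gold entities that were not predicted exactly
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return tp, fp, fn, precision, recall, f1


# Gold labels for: Joe Smith went to Hawaii .
gold = ["PERSON", "PERSON", "O", "O", "LOCATION", "O"]
# Hypothetical prediction that misses "Smith"
predicted = ["PERSON", "O", "O", "O", "LOCATION", "O"]

print(entity_scores(gold, predicted))  # (1, 1, 1, 0.5, 0.5, 0.5)
```

In this hypothetical prediction, five of the six token labels are correct, yet only one of the two gold entities is matched: the truncated "Joe" entity counts as a false positive and the missed "Joe Smith" entity as a false negative. The same formulas reproduce the figures in the question: with TP = 255, FP = 0, and FN = 1, precision is 255/255 = 1.0000, recall is 255/256 ≈ 0.9961, and F1 is 510/511 ≈ 0.9980.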