Inter-annotator agreement when annotators assign more than one category to a subject

Problem description

I want to find the inter-annotator agreement for a few annotators. Each annotator assigns a few categories (out of 10 categories) to each subject.

For example, there are 3 annotators, 10 categories and 100 subjects.

I am aware of Cohen's kappa (http://en.wikipedia.org/wiki/Cohen's_kappa, for two annotators) and Fleiss' kappa (http://en.wikipedia.org/wiki/Fleiss%27_kappa, for more than two annotators) as measures of inter-annotator agreement, but I realized that they may not work if an annotator assigns more than one category to a subject.

Does anyone have an idea how to determine inter-annotator agreement in this scenario?

Thanks

Recommended answer

I had to do this several years ago. I can't recall exactly how I did it (I no longer have the code), but I still have a worked example that I reported to my professor. I was annotating comments, with 56 categories and 4 annotators.

Note: at the time I needed a way to detect where annotators disagreed most, so that after each annotation session they could focus on why they disagreed and set out reasonable rules to maximize this statistic. It worked well for that purpose.

Let's assume A-D are annotators and 1-5 are categories. This is a possible scenario.

     A    B    C    D    Probability of agreement
1    X    X    X    X    4/4
2    X    X    X         3/4
3    X    X              2/4
4    X                   1/4
5

A tags this comment with categories 1, 2, 3 and 4; B tags it with 1, 2 and 3; and so forth.

For each category, the probability of agreement is calculated as the fraction of annotators who chose that category.

These probabilities are summed, and the sum is divided by the number of unique categories tagged for that particular comment.

For the example comment this gives (4/4 + 3/4 + 2/4 + 1/4) / 4 = 10/16 as the annotators' agreement. This is a value between 0 and 1.
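The calculation above can be sketched in Python roughly as follows. This is only an illustration of the worked example, not the original code (which is lost): the function name `comment_agreement`, the dict-of-sets input layout, and returning `None` when nobody tagged anything are all assumptions made here. `Fraction` is used so the result can be compared exactly with 10/16.

```python
from fractions import Fraction


def comment_agreement(tags_by_annotator):
    """Agreement score for one comment, following the worked example.

    tags_by_annotator maps an annotator name to the set of categories
    that annotator assigned to the comment (hypothetical input layout).
    """
    n_annotators = len(tags_by_annotator)
    # Unique categories that at least one annotator used for this comment.
    used = set().union(*tags_by_annotator.values())
    if not used:
        # Assumption: the score is treated as undefined when nothing was tagged.
        return None
    # Per-category agreement: fraction of annotators who chose the category.
    per_category = [
        Fraction(sum(cat in tags for tags in tags_by_annotator.values()),
                 n_annotators)
        for cat in used
    ]
    # Sum the per-category values and divide by the number of unique categories.
    return sum(per_category) / len(used)


example = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3},
    "C": {1, 2},
    "D": {1},
}
score = comment_agreement(example)
print(score, float(score))  # 5/8 0.625, i.e. 10/16
```

Averaging this score over all comments, or sorting comments by it, would be one way to see where annotators disagree most, although how to aggregate is a choice the answer does not specify.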

If this doesn't work for you, see page 567 of http://www.mitpressjournals.org/doi/pdf/10.1162/coli.07-034-R2, which is referenced by the case study on page 587.
