Computing F-measure for clustering


Question

Can anyone help me calculate a single, overall F-measure? I know how to calculate recall and precision, but I don't know how to calculate one F-measure value for a given algorithm.

As an example, suppose my algorithm creates m clusters, but I know there are n clusters for the same data (as created by another benchmark algorithm).

I found a PDF ("F Measure explained"), but it was not useful because the overall value I got was greater than 1. I have read research papers in which the authors compare two algorithms on the basis of F-measure, and their overall values lie between 0 and 1. If you read the PDF carefully, the formula is

           F(C, K) = Σ_i (|c_i| / N) * max_j F(c_i, k_j)

where c_i is a reference cluster and k_j is a cluster created by the other algorithm; i runs from 1 to n and j runs from 1 to m. Say |c_1| = 218 and, as per the PDF, N = m * n with m = 12 and n = 10, so N = 120, and the maximum of F(c_1, k_j) is reached at j = 2. F(c_1, k_2) is certainly between 0 and 1, but the weight |c_1| / N = 218 / 120 is already greater than 1, so the value given by the formula can exceed 1.
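For reference, here is a minimal sketch of the formula quoted above. It assumes that N is the total number of data points, as in the common definition of this measure, rather than m * n as read in the question, and that F(c_i, k_j) is the usual harmonic mean of precision and recall between two clusters. The function names and the toy clusters are hypothetical and do not come from the PDF.

```python
def f_score(reference, found):
    """Harmonic mean of precision and recall between two clusters (sets)."""
    overlap = len(reference & found)
    if overlap == 0:
        return 0.0
    precision = overlap / len(found)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def weighted_f_measure(reference_clusters, found_clusters):
    """Sum over reference clusters of (|c_i| / N) * max_j F(c_i, k_j)."""
    # Assumption: N is the total number of data points, so the weights sum to 1.
    n_points = sum(len(c) for c in reference_clusters)
    return sum(
        len(c) / n_points * max(f_score(c, k) for k in found_clusters)
        for c in reference_clusters
    )


# Hypothetical toy data: two reference clusters, three found clusters.
reference_clusters = [{1, 2, 4}, {3, 5, 6}]
found_clusters = [{1, 2, 3}, {4, 5}, {6}]
overall = weighted_f_measure(reference_clusters, found_clusters)
print(overall)
```

Under that assumption the weights |c_i| / N sum to 1 and each per-cluster maximum is at most 1, so the overall value stays between 0 and 1.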

Recommended answer

For example, given the set


           D = {1, 2, 3, 4, 5, 6}

and the partitions


           P = {1, 2, 3}, {4, 5}, {6}, and
           Q = {1, 2, 4}, {3, 5, 6}

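where P is the partition created by our algorithm and Q is the partition created by a known standard algorithm, let PairsP, PairsQ, and PairsD be the sets of unordered pairs of elements that share a cluster in P, share a cluster in Q, and can be formed from D at all, respectively: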


           PairsP = {(1, 2), (1, 3), (2, 3), (4, 5)},
           PairsQ = {(1, 2), (1, 4), (2, 4), (3, 5), (3, 6), (5, 6)}, and
           PairsD = {(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 3), (2, 4),
                      (2, 5), (2, 6), (3, 4), (3, 5), (3, 6), (4, 5), (4, 6), (5, 6)}
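The pair sets can also be built mechanically. The following is a minimal sketch using `itertools.combinations` from the Python standard library; the helper name `co_clustered_pairs` is hypothetical and chosen only for this illustration.

```python
from itertools import combinations


def co_clustered_pairs(partition):
    """Return the set of unordered pairs of elements that share a cluster."""
    pairs = set()
    for cluster in partition:
        # sorted() gives every pair a canonical (smaller, larger) order.
        pairs.update(combinations(sorted(cluster), 2))
    return pairs


D = {1, 2, 3, 4, 5, 6}
P = [{1, 2, 3}, {4, 5}, {6}]
Q = [{1, 2, 4}, {3, 5, 6}]

pairs_p = co_clustered_pairs(P)
pairs_q = co_clustered_pairs(Q)
pairs_d = co_clustered_pairs([D])  # D treated as one big cluster: all pairs

print(sorted(pairs_p))
print(sorted(pairs_q))
print(len(pairs_d))
```

Singleton clusters such as {6} contribute no pairs, which is why 6 does not appear in PairsP.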

So,


           a = |PairsP ∩ PairsQ| = |{(1, 2)}| = 1
           b = |PairsP - PairsQ| = |{(1, 3), (2, 3), (4, 5)}| = 3
           c = |PairsQ - PairsP| = |{(1, 4), (2, 4), (3, 5), (3, 6), (5, 6)}| = 5

which gives

           F-measure = 2a / (2a + b + c) = 2 / (2 + 3 + 5) = 0.2
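The whole calculation can be sketched in a few lines of Python. This is only an illustration of the pair-counting formula above: the function name is hypothetical, and returning 1.0 when neither partition contains any pair is an assumption made here to avoid dividing by zero, not something the answer specifies.

```python
from itertools import combinations


def pair_counting_f_measure(partition_p, partition_q):
    """Pair-counting F-measure 2a / (2a + b + c) between two partitions."""

    def pairs(partition):
        # Unordered pairs of elements that share a cluster.
        return {
            pair
            for cluster in partition
            for pair in combinations(sorted(cluster), 2)
        }

    pairs_p, pairs_q = pairs(partition_p), pairs(partition_q)
    a = len(pairs_p & pairs_q)  # pairs grouped together in both partitions
    b = len(pairs_p - pairs_q)  # pairs grouped together only in P
    c = len(pairs_q - pairs_p)  # pairs grouped together only in Q
    denominator = 2 * a + b + c
    if denominator == 0:
        # Assumption: two all-singleton partitions count as perfect agreement.
        return 1.0
    return 2 * a / denominator


P = [{1, 2, 3}, {4, 5}, {6}]
Q = [{1, 2, 4}, {3, 5, 6}]
score = pair_counting_f_measure(P, Q)
print(score)
```

Because a, b, and c are counts of disjoint sets of pairs, 2a can never exceed 2a + b + c, so this value always lies between 0 and 1. It also equals the harmonic mean of the pair precision a / (a + b) and the pair recall a / (a + c).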
