How to show that an NDCG score is significant


Question

Suppose the NDCG score for my retrieval system is 0.8. How do I interpret this score? How do I tell the reader that this score is significant?

Recommended answer

To understand this, let's walk through an example of Normalized Discounted Cumulative Gain (nDCG).
Computing nDCG requires the DCG and the ideal DCG (IDCG).
Let's first look at Cumulative Gain (CG).

Example: Suppose the system returns five documents in this order: [Doc_1, Doc_2, Doc_3, Doc_4, Doc_5]
Doc_1 is 100% relevant
Doc_2 is 70% relevant
Doc_3 is 95% relevant
Doc_4 is 20% relevant
Doc_5 is 100% relevant

So our Cumulative Gain (CG) is

CG = 100 + 70 + 95 + 20 + 100  ###(Index of the doc doesn't matter)
   = 385
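
As a quick check of the point that position does not affect CG, here is a minimal Python sketch. The function name `cumulative_gain` and the list `relevance` are illustrative names, not part of the original answer.

```python
from itertools import permutations

# Relevance scores of Doc_1 .. Doc_5, in the order the system returned them.
relevance = [100, 70, 95, 20, 100]

def cumulative_gain(scores):
    """Plain sum of relevance scores; ranking position is ignored."""
    return sum(scores)

cg = cumulative_gain(relevance)

# Every possible ordering of the same documents gives the same CG.
same_for_all_orders = all(
    cumulative_gain(order) == cg for order in permutations(relevance)
)
```

Because CG cannot tell a good ordering from a bad one, it says nothing about ranking quality on its own, which is why the discount is introduced next.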


The Discounted Cumulative Gain (DCG) is

DCG = SUM( relevanceAt(index) / log2(index + 1) ) ### where index runs from 1 to 5

Doc_1 is 100 / log2(2) = 100.00
Doc_2 is 70  / log2(3) = 044.17
Doc_3 is 95  / log2(4) = 047.50
Doc_4 is 20  / log2(5) = 008.61
Doc_5 is 100 / log2(6) = 038.69

DCG = 100 + 44.17 + 47.5 + 8.61 + 38.69
DCG = 238.97
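
The DCG sum above can be sketched in Python as follows. The names `dcg` and `relevance` are illustrative, and positions are assumed to be 1-based, as in the formula.

```python
import math

def dcg(scores):
    """Sum of relevance / log2(position + 1), with positions starting at 1."""
    return sum(
        score / math.log2(position + 1)
        for position, score in enumerate(scores, start=1)
    )

# Relevance scores of Doc_1 .. Doc_5, in the order the system returned them.
relevance = [100, 70, 95, 20, 100]
dcg_value = dcg(relevance)
```

Run on the example, this gives roughly 238.96. The hand calculation shows 238.97 because each term was rounded to two decimals before summing.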

The ideal DCG (IDCG) is the DCG of the same documents sorted by decreasing relevance:

Ideal order = Doc_1, Doc_5, Doc_3, Doc_2, Doc_4

Doc_1 is 100 / log2(2) = 100.00
Doc_5 is 100 / log2(3) = 063.09
Doc_3 is 95  / log2(4) = 047.50
Doc_2 is 70  / log2(5) = 030.15
Doc_4 is 20  / log2(6) = 007.74

IDCG = 100 + 63.09 + 47.5 + 30.15 + 7.74
IDCG = 248.48

nDCG(5) = DCG    / IDCG
        = 238.97 / 248.48
        = 0.96

Conclusion:

In the given example the nDCG is 0.96. This is not a prediction accuracy; it measures how effective the ranking of the documents is, relative to the ideal ordering (which scores 1.0). The gain is accumulated from the top of the result list to the bottom, with the gain of each result discounted at lower ranks.
Wiki reference
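
To reproduce the whole calculation, here is a minimal Python sketch under the same assumptions as the worked example (graded relevance on a 0 to 100 scale, a log2 discount, 1-based positions). The function names `dcg` and `ndcg` are illustrative; returning 0.0 when no document is relevant is a convention chosen here, not something the original answer specifies.

```python
import math

def dcg(scores):
    """Sum of relevance / log2(position + 1), with positions starting at 1."""
    return sum(
        score / math.log2(position + 1)
        for position, score in enumerate(scores, start=1)
    )

def ndcg(scores):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg(sorted(scores, reverse=True))
    if ideal_dcg == 0:
        return 0.0  # assumed convention when nothing is relevant
    return dcg(scores) / ideal_dcg

# Relevance scores of Doc_1 .. Doc_5, in the order the system returned them.
relevance = [100, 70, 95, 20, 100]
print(round(ndcg(relevance), 2))

# A ranking that is already in ideal order scores exactly 1.0.
print(ndcg(sorted(relevance, reverse=True)))
```

Comparing the system's score against the 1.0 of the ideal ordering, or against the score of a baseline ranking of the same documents, is one way to give the reader a sense of scale.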
