Calculate accuracy of a dataset


Question

I have two tables (X and Y) that each map, say, a soccer player to a team. The data in table X is reliable, but I am not sure about the reliability of the data in table Y. Table X has 3,000 rows and table Y has 1,000. How can I calculate how accurate the mapping in table Y is, using the data in table X as the truth set or superset?

Table X

PlayerID   | Name      | Team
007        | Sancho    | Dortmund
010        | Messi     | Barcelona
011        | Werner    | Chelsea
001        | De Gea    | Man Utd
009        | Lewan..ki | Bayern Mun
006        | Pogba     | Man Utd
017        | De Bruyne | Man City
029        | Harvertz  | Chelsea
005        | Upamecano | Leipzig

Table Y

PlayerID   | Name      | Team
010        | Messi     | Man City
007        | Sancho    | Man Utd
006        | Pogba     | Man Utd
017        | De Bruyne | Man City
011        | Werner    | Liverpool
006        | Pogba     | Real Madrid

Based on Table X, we can see that only playerIDs 006 and 017 are accurate. However, playerID 006 is only partially accurate, because table Y maps it to two different teams.

Recommended answer

You can use a left join and conditional logic to compute the accuracy.

In MySQL, you could phrase this as:

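-- For each row of x, look for an identical row in y. MySQL evaluates
-- "is not null" to 1 or 0, so avg() yields the share of x rows matched in y.
select avg(y.playerID is not null) as accuracy_ratio
from x
left join y
    on  y.playerID = x.playerID
    and y.name     = x.name
    and y.team     = x.team;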

This gives you a value between 0 and 1 that represents the accuracy ratio (multiply it by 100 if you want a percentage).
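To see what the query returns on the sample rows from the question, here is a minimal sketch that loads them into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption made so the example runs without a server); like MySQL, it evaluates `IS NOT NULL` to 1 or 0. The helper name load_tables is hypothetical.

```python
import sqlite3

# Sample rows copied from Table X and Table Y in the question.
X_ROWS = [
    ("007", "Sancho", "Dortmund"), ("010", "Messi", "Barcelona"),
    ("011", "Werner", "Chelsea"), ("001", "De Gea", "Man Utd"),
    ("009", "Lewan..ki", "Bayern Mun"), ("006", "Pogba", "Man Utd"),
    ("017", "De Bruyne", "Man City"), ("029", "Harvertz", "Chelsea"),
    ("005", "Upamecano", "Leipzig"),
]
Y_ROWS = [
    ("010", "Messi", "Man City"), ("007", "Sancho", "Man Utd"),
    ("006", "Pogba", "Man Utd"), ("017", "De Bruyne", "Man City"),
    ("011", "Werner", "Liverpool"), ("006", "Pogba", "Real Madrid"),
]


def load_tables():
    """Create in-memory tables x and y (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE x (playerID TEXT, name TEXT, team TEXT)")
    conn.execute("CREATE TABLE y (playerID TEXT, name TEXT, team TEXT)")
    conn.executemany("INSERT INTO x VALUES (?, ?, ?)", X_ROWS)
    conn.executemany("INSERT INTO y VALUES (?, ?, ?)", Y_ROWS)
    return conn


# The query from the answer, unchanged apart from keyword casing.
QUERY = """
SELECT AVG(y.playerID IS NOT NULL) AS accuracy_ratio
FROM x
LEFT JOIN y
    ON  y.playerID = x.playerID
    AND y.name     = x.name
    AND y.team     = x.team
"""

conn = load_tables()
accuracy_ratio = conn.execute(QUERY).fetchone()[0]
print(f"accuracy_ratio = {accuracy_ratio:.4f}")
```

On these rows the ratio is 2/9: of the nine rows in x, only 006 (Pogba, Man Utd) and 017 (De Bruyne, Man City) have an identical row in y. The average is taken over the rows of x, so players that are missing from y count as misses.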

Note that this assumes playerID uniquely identifies records in both tables. In the sample data that does not hold for table Y, where playerID 006 appears twice.
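If the goal is the share of table Y's rows that agree with table X, one possible variant (not part of the original answer) reverses the join so that the average runs over the rows of y, and adds a per-player breakdown that surfaces partially accurate IDs such as 006. The sketch below again uses SQLite through Python's sqlite3 module as an assumed stand-in for MySQL, and it relies on playerID being unique in x; the column alias match_share is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows copied from Table X and Table Y in the question.
conn.executescript("""
CREATE TABLE x (playerID TEXT, name TEXT, team TEXT);
CREATE TABLE y (playerID TEXT, name TEXT, team TEXT);
INSERT INTO x VALUES
    ('007', 'Sancho', 'Dortmund'), ('010', 'Messi', 'Barcelona'),
    ('011', 'Werner', 'Chelsea'), ('001', 'De Gea', 'Man Utd'),
    ('009', 'Lewan..ki', 'Bayern Mun'), ('006', 'Pogba', 'Man Utd'),
    ('017', 'De Bruyne', 'Man City'), ('029', 'Harvertz', 'Chelsea'),
    ('005', 'Upamecano', 'Leipzig');
INSERT INTO y VALUES
    ('010', 'Messi', 'Man City'), ('007', 'Sancho', 'Man Utd'),
    ('006', 'Pogba', 'Man Utd'), ('017', 'De Bruyne', 'Man City'),
    ('011', 'Werner', 'Liverpool'), ('006', 'Pogba', 'Real Madrid');
""")

# Row-level accuracy of y: the share of y rows with an identical row in x.
# Because playerID is unique in x, the join cannot duplicate y rows.
row_accuracy = conn.execute("""
    SELECT AVG(x.playerID IS NOT NULL)
    FROM y
    LEFT JOIN x
        ON  x.playerID = y.playerID
        AND x.name     = y.name
        AND x.team     = y.team
""").fetchone()[0]

# Per-player breakdown: 1.0 means every y row for that player matches x,
# a value strictly between 0 and 1 flags a partially accurate player.
per_player = dict(conn.execute("""
    SELECT y.playerID, AVG(x.playerID IS NOT NULL) AS match_share
    FROM y
    LEFT JOIN x
        ON  x.playerID = y.playerID
        AND x.name     = y.name
        AND x.team     = y.team
    GROUP BY y.playerID
    ORDER BY y.playerID
""").fetchall())

print(f"row-level accuracy of y: {row_accuracy:.4f}")
for player_id, share in per_player.items():
    print(player_id, share)
```

On the sample rows, two of the six rows in y match x, and the per-player breakdown reports 1.0 for 017, 0.5 for 006, and 0.0 for the other three players. Whether a partially accurate player should count as correct is a choice the question leaves open.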
