A Better Way to Calculate the Odds Ratio in Pandas

Problem description

I have a DataFrame counts1 which looks like this:

Factor            w-statin  wo-statin
AgeGroups Cancer                     
0-5       No           108       6575
          Yes            0        223
11-15     No             5       3669
          Yes            1        143
16-20     No            28       6174
          Yes            1        395
21-25     No            80       8173
          Yes            2        624
26-30     No           110       9143
          Yes            2        968
30-35     No           171       9046
          Yes            5       1225
35-40     No           338       8883
          Yes           21       1475

I wanted to calculate the odds ratio (w-statin/wo-statin). I did it the old-fashioned way, as I would on paper:

counts1['sumwwoStatin']= counts1['w-statin']+counts1['wo-statin']

counts1['oddRatio']=((counts1['w-statin']/counts1['sumwwoStatin'])/(counts1['wo-statin']/counts1['sumwwoStatin']))

Is there a better way to compute odds ratios, relative risk, contingency tables, and chi-square tests in Pandas, just like in R? Any suggestions are appreciated. Oh, by the way, I forgot to mention what my CSV looks like:

    Frequency Cancer     Factor AgeGroups
0         223    Yes  wo-statin       0-5
1         112    Yes  wo-statin      6-10
2         143    Yes  wo-statin     11-15
3         395    Yes  wo-statin     16-20
4         624    Yes  wo-statin     21-25
5         968    Yes  wo-statin     26-30
6        1225    Yes  wo-statin     30-35
7        1475    Yes  wo-statin     35-40
8        2533    Yes  wo-statin     41-45
9        4268    Yes  wo-statin     46-50
10       5631    Yes  wo-statin     52-55
11       6656    Yes  wo-statin     56-60
12       7166    Yes  wo-statin     61-65
13       8573    Yes  wo-statin     66-70
14       8218    Yes  wo-statin     71-75
15       4614    Yes  wo-statin     76-80
16       1869    Yes  wo-statin     81-85
17        699    Yes  wo-statin     86-90
18        157    Yes  wo-statin     91-95
19         31    Yes  wo-statin    96-100
20          5    Yes  wo-statin      >100
21        108     No   w-statin       0-5
22          6     No   w-statin      6-10
23          5     No   w-statin     11-15
24         28     No   w-statin     16-20
25         80     No   w-statin     21-25
26        110     No   w-statin     26-30
27        171     No   w-statin     30-35
28        338     No   w-statin     35-40
29        782     No   w-statin     41-45
..

Recommended answer

AFAIK pandas does not provide statistical tests; it only covers basic descriptive statistics such as mean, variance, and correlation.

However, you can rely on scipy for this. You'll find most of what you need there. For instance, to calculate the odds ratio:

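import scipy.stats as stats

# Sum the counts over all age groups to get a 2x2 table
# (rows: Cancer No/Yes; columns: w-statin, wo-statin).
# Only the two count columns are selected so that helper columns are not summed.
table = counts1[["w-statin", "wo-statin"]].groupby(level="Cancer").sum().values
print(table)
# [[  840 51663]
#  [   32  5053]]

oddsratio, pvalue = stats.fisher_exact(table)
print("OddsR: ", oddsratio, "p-Value:", pvalue)
# OddsR:  2.56743220487 p-Value: 2.72418938361e-09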
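The question also asks about chi-square tests and relative risk. Here is a minimal sketch on the same 2x2 table, assuming the pooled counts printed above. The variable names are illustrative, and the odds-ratio and relative-risk lines are plain arithmetic on the cells rather than a dedicated library call:

```python
import numpy as np
from scipy import stats

# 2x2 table from the answer:
# rows = Cancer (No, Yes), columns = (w-statin, wo-statin)
table = np.array([[840, 51663],
                  [32, 5053]])

# Chi-square test of independence. For a 2x2 table SciPy applies
# Yates' continuity correction by default (correction=True).
chi2, p, dof, expected = stats.chi2_contingency(table)

# Sample odds ratio as the cross-product of the cells,
# the same quantity fisher_exact reports for this layout.
(a, b), (c, d) = table
odds_ratio = (a * d) / (b * c)

# Proportion of cancer cases within each statin column,
# and their ratio (relative risk of cancer, w-statin vs wo-statin).
risk_w = c / (a + c)
risk_wo = d / (b + d)
relative_risk = risk_w / risk_wo

print("chi2:", chi2, "p:", p, "dof:", dof)
print("odds ratio:", odds_ratio, "relative risk:", relative_risk)
```

With this row order the cross-product compares the "No" row to the "Yes" row; its reciprocal is the odds ratio of cancer with versus without statin.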

See here and here for more.
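If you start from the long-format CSV in the question, the contingency table can be built directly in pandas. The sketch below is one possible approach, not the only one: it uses a handful of rows copied from the 0-5 and 11-15 age groups shown in the question, and the names long_df, counts, pooled and per_age are made up for illustration.

```python
import pandas as pd

# A few rows in the same long format as the CSV in the question
# (values taken from the 0-5 and 11-15 age groups shown there).
long_df = pd.DataFrame({
    "Frequency": [223, 143, 108, 5, 0, 1, 6575, 3669],
    "Cancer": ["Yes", "Yes", "No", "No", "Yes", "Yes", "No", "No"],
    "Factor": ["wo-statin", "wo-statin", "w-statin", "w-statin",
               "w-statin", "w-statin", "wo-statin", "wo-statin"],
    "AgeGroups": ["0-5", "11-15", "0-5", "11-15",
                  "0-5", "11-15", "0-5", "11-15"],
})

# Reshape to the layout of counts1: one row per (AgeGroups, Cancer),
# one column per Factor.
counts = long_df.pivot_table(index=["AgeGroups", "Cancer"], columns="Factor",
                             values="Frequency", aggfunc="sum", fill_value=0)

# Pooled 2x2 table over all age groups.
pooled = counts.groupby(level="Cancer").sum()

# Per-age-group odds ratio, using the same cross-product as above.
wide = counts.unstack("Cancer")
per_age = (wide[("w-statin", "No")] * wide[("wo-statin", "Yes")]) / (
    wide[("wo-statin", "No")] * wide[("w-statin", "Yes")])

print(pooled)
print(per_age)
```

The 0-5 group has a zero cell, so its ratio comes out as infinity; small or empty cells like this need care before interpreting a per-group value. Note also that the per-row expression in the question reduces to w-statin / wo-statin, which is the odds within a single row rather than a ratio of two odds.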
