Empirical Distribution Function in Numpy


Question

I have the following list of values:

x = [-0.04124324405924407, 0, 0.005249724476788287, 0.03599351958245578, -0.00252785423151014, 0.01007584102031178, -0.002510349639322063,...]

I want to calculate the empirical density function, so I think I first need to calculate the empirical cumulative distribution function. I've used this code:

counts = np.asarray(np.bincount(x), dtype=float)
cdf = counts.cumsum() / counts.sum()

Then I evaluate it at this value:

print cdf[0.01007584102031178]

I always get 1, so I guess I made a mistake. Do you know how to fix it? Thanks!

Recommended answer

The usual definition of the empirical cdf is the number of observations less than or equal to the given value, divided by the total number of observations. With a 1d numpy array x and a value v, this is x[x <= v].size / x.size (this needs float division; in Python 2, add from __future__ import division):

import numpy as np

x = np.array([-0.04124324405924407,  0,
               0.005249724476788287, 0.03599351958245578,
              -0.00252785423151014,  0.01007584102031178,
              -0.002510349639322063])
v = 0.01007584102031178
# fraction of observations that are <= v
print(x[x <= v].size / x.size)

This prints 0.857142857143 in Python 2 (Python 3 shows more digits: 0.8571428571428571). The exact value of the empirical cdf at 0.01007584102031178 is 6/7, since six of the seven observations are less than or equal to it.

This is quite expensive if your array is large and you need to compute the cdf for several values, because each evaluation scans the whole array. In such cases you can keep a sorted copy of your data and use np.searchsorted() to find the number of observations <= v:

def ecdf(x):
    x = np.sort(x)
    def result(v):
        return np.searchsorted(x, v, side='right') / x.size
    return result

cdf = ecdf(x)
print(cdf(v))
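
Because np.searchsorted() also accepts an array of query points, the same closure can evaluate the cdf at several values in one call. The following is a minimal sketch of that usage under Python 3; the queries array and its values are made up for illustration and are not part of the original question.

```python
import numpy as np

def ecdf(x):
    # keep a sorted copy so each query is a binary search
    x = np.sort(np.asarray(x, dtype=float))
    def result(v):
        # side='right' counts observations <= v (ties included)
        return np.searchsorted(x, v, side='right') / x.size
    return result

x = [-0.04124324405924407, 0,
     0.005249724476788287, 0.03599351958245578,
     -0.00252785423151014, 0.01007584102031178,
     -0.002510349639322063]

cdf = ecdf(x)

# hypothetical query points, chosen only for illustration
queries = np.array([-0.05, 0.0, 0.01007584102031178, 0.04])
values = cdf(queries)
print(values)
```

The result is 0 below the smallest observation, 1 at or above the largest, and a step of 1/7 at each distinct observation in between.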
