Returning two values from pandas.rolling_apply

Question

I am using pandas.rolling_apply to fit data to a distribution and get a value from it, but I also need it to report a rolling goodness of fit (specifically, the p-value). Currently I'm doing it like this:

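import pandas as pd
from scipy.stats import genextreme, kstest

def func(sample):
    fit = genextreme.fit(sample)
    return genextreme.isf(0.9, *fit)

def p_value(sample):
    fit = genextreme.fit(sample)
    return kstest(sample, 'genextreme', fit)[1]

# data is the input series. pd.rolling_apply was removed in pandas 0.23;
# on current versions the equivalent call is data.rolling(30).apply(func).
values = pd.rolling_apply(data, 30, func)
p_values = pd.rolling_apply(data, 30, p_value)
results = pd.DataFrame({'values': values, 'p_value': p_values})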

The problem is that I have a lot of data and the fit function is expensive, so I don't want to call it twice for every sample. What I'd rather do is something like this:

def func(sample):
    fit = genextreme.fit(sample)
    value = genextreme.isf(0.9, *fit)
    p_value = kstest(sample, 'genextreme', fit)[1]
    return {'value': value, 'p_value': p_value}

results = pd.rolling_apply(data, 30, func)

Here results would be a DataFrame with two columns. When I run this, I get an exception: TypeError: a float is required. Is it possible to achieve this, and if so, how?

Recommended answer

I had a similar problem before. Here's my solution:

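from collections import deque

class your_multi_output_function_class:
    """Compute several outputs once per window and hand them out one column at a time."""

    def __init__(self):
        # Queues holding the extra outputs until f2 and f3 collect them.
        self.deque_2 = deque()
        self.deque_3 = deque()

    def f1(self, window):
        # somefunction is a placeholder for the expensive computation; it must
        # return a sequence of three values for the given window.
        self.k = somefunction(window)
        self.deque_2.append(self.k[1])
        self.deque_3.append(self.k[2])
        return self.k[0]

    def f2(self, window):
        # Ignores its window and returns the second output in the order f1 produced it.
        return self.deque_2.popleft()

    def f3(self, window):
        return self.deque_3.popleft()

func = your_multi_output_function_class()

# Relies on f1 having processed a window before f2 and f3 ask for its results,
# and on columns 'a', 'b' and 'c' producing the same number of windows.
output = your_pandas_object.rolling(window=10).agg(
    {'a': func.f1, 'b': func.f2, 'c': func.f3}
)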
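
If sharing state between callbacks feels fragile, one possible alternative is to loop over the windows explicitly and build the DataFrame from the per-window results. The sketch below is only an illustration under assumptions: rolling_multi and summarize are hypothetical names that do not appear in the question or the answer, the sample data is made up, and summarize is a cheap stand-in for the expensive fit. A function shaped like the dict-returning func from the question could be passed in its place.

```python
import numpy as np
import pandas as pd


def rolling_multi(series, window, func):
    """Call func once per full window and collect its dict results as columns.

    Hypothetical helper, not part of pandas.
    """
    values = series.to_numpy()
    records = []
    for end in range(window, len(values) + 1):
        # func sees a plain NumPy array holding the current window.
        records.append(func(values[end - window:end]))
    # Label each result with the last index entry of its window, which is
    # where rolling().apply() would place it.
    result = pd.DataFrame(records, index=series.index[window - 1:])
    # Reindex so the leading, incomplete windows appear as NaN rows.
    return result.reindex(series.index)


def summarize(sample):
    # Stand-in for the expensive fit: any shared work happens once per window
    # and both outputs are derived from it.
    return {"value": sample.mean(), "spread": sample.max() - sample.min()}


data = pd.Series(np.arange(10, dtype=float))  # made-up sample data
results = rolling_multi(data, 3, summarize)
```

The explicit Python loop gives up whatever speed the built-in rolling machinery offers, but when the per-window function is itself expensive, as the fit in the question is, the loop overhead is likely to be small by comparison.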
