Pandas groupby and pct change not returning expected value


Problem description

For each Name in the following dataframe, I'm trying to find the percentage change in the Amount column from one Time to the next:

Code to create the dataframe:

import pandas as pd

df = pd.DataFrame({'Name': ['Ali', 'Ali', 'Ali', 'Cala', 'Cala', 'Cala', 'Elena', 'Elena', 'Elena'],
                   'Time': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'Amount': [24, 52, 34, 95, 98, 54, 32, 20, 16]})

df.sort_values(['Name', 'Time'], inplace=True)

The first approach I tried (based on this question and answer) used groupby and pct_change:

df['pct_change'] = df.groupby(['Name'])['Amount'].pct_change()

With the result:

This doesn't seem to be grouping by name, because it gives the same result as calling df['Amount'].pct_change() with no groupby at all. According to the Pandas documentation for pandas.core.groupby.DataFrameGroupBy.pct_change, the above approach should calculate the percentage change of each value relative to the previous value within its group.
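To make the symptom concrete, here is a minimal sketch that puts the ungrouped percentage change next to a per-group reference computed by hand (each value divided by the previous value within the same Name, minus 1). The `comparison` frame and its column names are only illustrative. The two columns should differ exactly at the first row of Cala and Elena, where the ungrouped calculation carries over the last value of the previous name (for example 95/34 - 1 ≈ 1.794 for Cala's first row) instead of producing NaN.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ali', 'Ali', 'Ali', 'Cala', 'Cala', 'Cala', 'Elena', 'Elena', 'Elena'],
                   'Time': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'Amount': [24, 52, 34, 95, 98, 54, 32, 20, 16]})
df = df.sort_values(['Name', 'Time'])

# Percentage change over the whole column, ignoring Name entirely
ungrouped = df['Amount'].pct_change()

# Hand-computed reference: divide by the previous Amount within the same Name
previous_in_group = df.groupby('Name')['Amount'].shift(1)
per_group = df['Amount'] / previous_in_group - 1

# Illustrative side-by-side view (column names are assumptions for this sketch)
comparison = pd.DataFrame({'Name': df['Name'],
                           'ungrouped': ungrouped,
                           'per_group': per_group})
print(comparison)
```

If the groupby and pct_change column from the question equals `ungrouped` rather than `per_group`, the grouping is being ignored.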

For a second approach I used groupby with apply and pct_change:

df['pct_change_with_apply'] = df.groupby('Name')['Amount'].apply(lambda x: x.pct_change())

With the result:

This time all the percentage changes are correct.
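As a side note, on newer pandas releases the result of groupby with apply may carry the group key as an extra index level (depending on the group_keys setting), which can make direct assignment to a column fail. A hedged alternative that does the same per-group calculation is transform, which returns a result aligned to the original index. The column name `pct_change_with_transform` below is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Ali', 'Ali', 'Ali', 'Cala', 'Cala', 'Cala', 'Elena', 'Elena', 'Elena'],
                   'Time': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'Amount': [24, 52, 34, 95, 98, 54, 32, 20, 16]})
df = df.sort_values(['Name', 'Time'])

# transform applies the function to each group's Amount values and
# returns a Series with the same index as df, so it assigns cleanly
df['pct_change_with_transform'] = (
    df.groupby('Name')['Amount'].transform(lambda s: s.pct_change())
)
print(df)
```

The first row of every name should be NaN, since there is no earlier Time within that group to compare against.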

Why does the groupby and pct_change approach not return the correct values, while using groupby with apply does?

Edit January 28, 2019: This behavior has been corrected in the latest version of Pandas, 0.24.0. To upgrade, run pip install -U pandas.

Solution

As @piRSquared already noted in the comments, this is due to a bug filed on GitHub as issue #21621. It appears to be resolved under milestone 0.24.0 (due 2018-12-31). My version (0.23.4) still displays the buggy behaviour.
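A quick way to check whether a given installation is affected is to compare the groupby and pct_change result against a hand-computed per-group reference. The sketch below assumes the version string starts with numeric major and minor components. The names `has_fix`, `reference` and `matches` are illustrative, not part of pandas.

```python
import numpy as np
import pandas as pd

# Data from the question, already ordered by Name and Time
df = pd.DataFrame({'Name': ['Ali', 'Ali', 'Ali', 'Cala', 'Cala', 'Cala', 'Elena', 'Elena', 'Elena'],
                   'Time': [1, 2, 3, 1, 2, 3, 1, 2, 3],
                   'Amount': [24, 52, 34, 95, 98, 54, 32, 20, 16]})

# Parse the leading "major.minor" of the installed pandas version
major, minor = (int(part) for part in pd.__version__.split('.')[:2])
has_fix = (major, minor) >= (0, 24)

# The call that misbehaved on 0.23.4 according to the answer
grouped = df.groupby('Name')['Amount'].pct_change()

# Reference: each Amount divided by the previous Amount within the same Name
reference = df['Amount'] / df.groupby('Name')['Amount'].shift(1) - 1

# NaN positions (first row of each group) must match too
matches = bool(np.allclose(grouped, reference, equal_nan=True))

print('pandas version:', pd.__version__)
print('0.24.0 or later:', has_fix)
print('groupby pct_change matches per-group reference:', matches)
```

On a version older than 0.24.0, the last line would be expected to print False, in which case the groupby with apply approach from the question remains a workaround.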
