GroupBy with ffill deletes group and does not put group in index


Question

I've been running into a very strange issue ever since I ported my code from one computer to another. I'm using pandas version 0.25.1 on this system, but I'm not sure which pandas version I was using previously.

The problem is as follows:

I create a simple, unsorted (mock) dataframe on which I want to sort the values and forward-fill all the NaN values.

In [1]: import pandas as pd
   ...: import numpy as np

In [2]: test = pd.DataFrame({"group" : ["A", "A", "A", "B", "B", "B", "C", "C"],
   ...:                      "count" : [2, 3, 1, 2, 1, 3, 1, 2],
   ...:                      "value" : [10, np.nan, 30, np.nan, 19, np.nan, 25, np.nan]})

In [3]: test
Out[3]:
  group  count  value
0     A      2   10.0
1     A      3    NaN
2     A      1   30.0
3     B      2    NaN
4     B      1   19.0
5     B      3    NaN
6     C      1   25.0
7     C      2    NaN

However, when I do that, I lose the entire "group" column, and it does not reappear in the index either.

In [4]: test.sort_values(["group", "count"]).groupby("group").ffill()
Out[4]:
   count  value
2      1   30.0
0      2   10.0
1      3   10.0
4      1   19.0
3      2   19.0
5      3   19.0
6      1   25.0
7      2   25.0

I've also tried the following with fillna, but it gives me the same result:

In [5]: test.sort_values(["group", "count"]).groupby("group").fillna(method = "ffill")
Out[5]:
   count  value
2      1   30.0
0      2   10.0
1      3   10.0
4      1   19.0
3      2   19.0
5      3   19.0
6      1   25.0
7      2   25.0

Does anyone know what I am doing wrong? The issue seems to be with the ffill method, since I can use .mean() on the groupby and retain my groupings.
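For comparison, here is a minimal sketch of the two behaviours described above, using the same mock dataframe. The variable names grouped, means and filled are only illustrative. It assumes pandas 0.25 or later, where ffill on a groupby acts as a row-by-row transformation rather than an aggregation.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"group": ["A", "A", "A", "B", "B", "B", "C", "C"],
                     "count": [2, 3, 1, 2, 1, 3, 1, 2],
                     "value": [10, np.nan, 30, np.nan, 19, np.nan, 25, np.nan]})

grouped = test.sort_values(["group", "count"]).groupby("group")

# Aggregation: one row per group, so the group labels become the index.
means = grouped.mean()

# Transformation: one row per input row, indexed like the input rows,
# with only the non-grouping columns returned.
filled = grouped.ffill()

print(means)
print(filled)
```

In other words, .mean() keeps the groupings because it collapses each group to a single row labelled by the group, while ffill keeps the original row labels and returns only the filled columns.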

Recommended answer

IIUC, you have to use update() to write the filled values back into the original dataframe:

test.update(test.sort_values(["group", "count"]).groupby("group").ffill())
print(test)

Output

  group  count  value
0     A      2   10.0
1     A      3   10.0
2     A      1   30.0
3     B      2   19.0
4     B      1   19.0
5     B      3   19.0
6     C      1   25.0
7     C      2   25.0
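As far as I can tell, the missing column matches a change listed in the pandas 0.25.0 release notes: ffill and bfill on a DataFrame groupby no longer return the group labels, only the filled values. If you would rather keep the sorted order than update the original frame in place, one alternative is to fill only the "value" column and assign it back. This is a sketch, not part of the answer above, and the name result is just illustrative:

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({"group": ["A", "A", "A", "B", "B", "B", "C", "C"],
                     "count": [2, 3, 1, 2, 1, 3, 1, 2],
                     "value": [10, np.nan, 30, np.nan, 19, np.nan, 25, np.nan]})

# Sort first so the forward fill follows the "count" order within each group.
result = test.sort_values(["group", "count"])

# Fill only the "value" column. The filled Series is aligned on the row index,
# so assigning it back leaves "group" and "count" untouched.
result["value"] = result.groupby("group")["value"].ffill()

print(result)
```

Unlike update(), this leaves test unchanged and returns the rows in sorted order with the "group" column still present.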
