Pandas groupby sequential values
Problem description
I don't know what this operation is called, so I couldn't really search for it, but here's what I'm trying to do:
I have this dataframe:
import pandas as pd
df = pd.DataFrame({"name": ["A", "B", "B", "B", "A", "A", "B"], "value": [3, 1, 2, 0, 5, 2, 3]})
df
name value
0 A 3
1 B 1
2 B 2
3 B 0
4 A 5
5 A 2
6 B 3
I want to group it on df.name and apply a max function to df.value, but only across rows where the same name appears consecutively. So my desired result is as follows:
df.groupby_sequence("name")["value"].agg(max)
name value
0 A 3
1 B 2
2 A 5
3 B 3
Any clue how to do this?
Using pandas, you can start a new group whenever the name changes from one row to the next, using (df.name != df.name.shift()).cumsum(), which gives every run of consecutive identical names its own group label:
>>> df.groupby((df.name!=df.name.shift()).cumsum()).max().reset_index(drop=True)
name value
0 A 3
1 B 2
2 A 5
3 B 3
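To see why this works, it may help to look at the intermediate steps. The following is a minimal sketch of the same idea, not a different method. The names starts_new_run and run_id are illustrative and do not come from the original answer. It also aggregates the two columns explicitly, taking the first name of each run rather than its max, which is one possible design choice.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "B", "B", "B", "A", "A", "B"],
    "value": [3, 1, 2, 0, 5, 2, 3],
})

# True wherever the name differs from the previous row. The first row is
# always True because shift() leaves a missing value there.
starts_new_run = df["name"] != df["name"].shift()

# Running count of run starts: every row in the same consecutive run
# shares one integer label. (run_id is a hypothetical helper name.)
run_id = starts_new_run.cumsum().rename("run_id")

# Aggregate per run: keep the run's name and take the max of its values.
result = (
    df.groupby(run_id)
      .agg(name=("name", "first"), value=("value", "max"))
      .reset_index(drop=True)
)
print(result)
```

Giving the key its own name (run_id) keeps it from sharing a name with the existing name column, and the named aggregation makes it explicit which function is applied to which column.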