Shift rows in pandas dataframe in a specific order
Question
I have a pandas dataframe which looks like this:
import pandas as pd

df = pd.DataFrame({
'job': ['football','football', 'football', 'basketball', 'basketball', 'basketball', 'hokey', 'hokey', 'hokey', 'football','football', 'football', 'basketball', 'basketball', 'basketball', 'hokey', 'hokey', 'hokey'],
'team': [4.0,5.0,9.0,2.0,3.0,6.0,1.0,7.0,8.0, 4.0,5.0,9.0,2.0,3.0,6.0,1.0,7.0,8.0],
'cluster': [0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1]
})
Each cluster contains 9 teams. Each cluster has 3 teams of each type of sport: football, basketball and hokey. I want to apply a shift function to each cluster, so that the order of teams changes in a very specific way (I tried to highlight it with color):
How can I do this transformation (shift rows in a way shown above) for a much larger dataframe?
Recommended answer
Let's use groupby + cumcount to create a sequential counter based on the columns cluster and job, then use sort_values to sort the dataframe on cluster and this counter:
df['j'] = df.groupby(['cluster', 'job']).cumcount()
df = df.sort_values(['cluster', 'j'], ignore_index=True).drop('j', axis=1)
           job  team  cluster
0     football   4.0        0
1   basketball   2.0        0
2        hokey   1.0        0
3     football   5.0        0
4   basketball   3.0        0
5        hokey   7.0        0
6     football   9.0        0
7   basketball   6.0        0
8        hokey   8.0        0
9     football   4.0        1
10  basketball   2.0        1
11       hokey   1.0        1
12    football   5.0        1
13  basketball   3.0        1
14       hokey   7.0        1
15    football   9.0        1
16  basketball   6.0        1
17       hokey   8.0        1
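
To see why this works, it may help to look at the counter itself on a smaller frame. The sketch below wraps the same groupby + cumcount + sort_values idea in a helper that leaves the input untouched. The function name interleave_by_group and the temporary column _pos are hypothetical names chosen for illustration, not part of pandas or of the answer above, and the 6-row frame is a cut-down version of the question's data.

```python
import pandas as pd


def interleave_by_group(df, cluster_col="cluster", group_col="job"):
    """Hypothetical helper: interleave rows across group_col within each cluster_col."""
    # Position of each row inside its (cluster, job) group: 0, 1, 2, ...
    counter = df.groupby([cluster_col, group_col]).cumcount()
    # Sort by cluster, then by that position. Rows that tie on both keys
    # keep their original relative order, which the answer's output relies on.
    return (
        df.assign(_pos=counter)  # "_pos" is a temporary, illustrative column name
          .sort_values([cluster_col, "_pos"], ignore_index=True)
          .drop(columns="_pos")
    )


# Reduced version of the question's data: one cluster, two teams per sport.
df = pd.DataFrame({
    "job": ["football", "football", "basketball", "basketball", "hokey", "hokey"],
    "team": [4.0, 5.0, 2.0, 3.0, 1.0, 7.0],
    "cluster": [0, 0, 0, 0, 0, 0],
})

# The counter restarts at 0 for every (cluster, job) pair.
print(df.groupby(["cluster", "job"]).cumcount().tolist())  # [0, 1, 0, 1, 0, 1]

result = interleave_by_group(df)
print(result["team"].tolist())  # [4.0, 2.0, 1.0, 5.0, 3.0, 7.0]
```

Because the helper uses assign rather than writing df['j'] in place, the original dataframe is not modified, which can be convenient when the same frame is reused later.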