Groupby of multiple columns and assigning values to each by considering start and end of each (Pandas)
Problem description
I've got a DataFrame that looks like this:
df1
v w x y
4 0 1 a b
5 0 1 a a
_________________
6 0 2 a b
_________________
2 0 3 a b
- - - - - - - - -
3 1 2 a b
_________________
15 1 3 a b
12 1 3 b b
_________________
13 1 1 a b
- - - - - - - - -
15 3 1 a b
14 3 1 b a
8 3 1 a b
9 3 1 a a
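So df1 was grouped (lines) by v and w and merged with another DataFrame that contained x and y. I need a new column z that picks the right group out of x and y under the following conditions:
- Within each subgroup of v (dashed lines), the first group should come from x (within a group, x always starts with "a" and y always starts with "b").
- Depending on the last letter of each group (a or b), the next group should start with b (column y) or a (column x).
- If both groups end with the same letter, pick the next group from x.
It should look like this: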
df1
v w x y z
4 0 1 a b a
5 0 1 a a a
_____________________
6 0 2 a b b
_____________________
2 0 3 a b a
- - - - - - - - - -- -
3 1 2 a b a
_____________________
15 1 3 a b b
12 1 3 b b b
_____________________
13 1 1 a b a
- - - - - - - - - -
15 3 1 a b a
14 3 1 b a b
8 3 1 a b a
9 3 1 a a a
So basically, within the subgroups of v, the last letter of a group and the first letter of the next group should be different. Is that understandable, and could anyone help me?
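Recommended answer
import numpy as np
import pandas as pd
df = df.reset_index(drop=True)
# s: the larger of x/y per row, read bottom-up, with a running count of rows that are not 'b'
s = pd.DataFrame(np.sort(df[['x', 'y']], axis=1), index=df.index)[1].iloc[::-1].ne('b').cumsum()
df.groupby([df.v, df.w, s]).ngroup()  # number each (v, w, s) combination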
0 0
1 0
2 1
3 2
4 4
5 5
6 5
7 3
8 6
9 6
10 6
11 6
dtype: int64
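The output above numbers the groups but does not fill in z itself. The following is a minimal sketch of one possible reading of the rules in the question, not the answerer's method. It assumes that the rows of each (v, w) group are contiguous, that the index is unique (the sample data is rebuilt here with a default index), and that the third rule acts as a fallback to x. The function name `assign_z` is hypothetical.

```python
import pandas as pd

# Sample data copied from the question; the original index labels are dropped
# so that the index is unique (an assumption of this sketch).
df1 = pd.DataFrame({
    "v": [0, 0, 0, 0, 1, 1, 1, 1, 3, 3, 3, 3],
    "w": [1, 1, 2, 3, 2, 3, 3, 1, 1, 1, 1, 1],
    "x": ["a", "a", "a", "a", "a", "a", "b", "a", "a", "b", "a", "a"],
    "y": ["b", "a", "b", "b", "b", "b", "b", "b", "b", "a", "b", "a"],
})


def assign_z(df):
    """Pick x or y per (v, w) group so that consecutive groups within the
    same v do not end and start on the same letter (hypothetical helper)."""
    z = pd.Series(index=df.index, dtype=object)
    # sort=False keeps the groups in order of first appearance.
    for _, by_v in df.groupby("v", sort=False):
        last = None  # last letter of the previously chosen group in this v
        for _, grp in by_v.groupby("w", sort=False):
            x_clashes = last is not None and grp["x"].iloc[0] == last
            y_differs = grp["y"].iloc[0] != last
            # First group in v, or no clash with x: take x.
            # Take y only when x clashes and y does not; otherwise fall back to x.
            col = "y" if (x_clashes and y_differs) else "x"
            z.loc[grp.index] = grp[col].to_numpy()
            last = grp[col].iloc[-1]
    return z


df1["z"] = assign_z(df1)
print(df1)
```

On the sample data this reproduces the z column shown in the question. An explicit loop is used because each choice depends on the group chosen before it, which makes the rule awkward to express as a single vectorized groupby operation.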