How to generate unique id and sub_id for each group
Problem description
My goal is to generate an id (the trajectory id) and a sub_id (numbered within each trajectory) for each group defined by u_uuid and p_uuid.
I tried the ngroup function, but it didn't work:
import pandas as pd

data = [
{'u_uuid': 110, 'p_uuid': 'aaa', 'mode': 'walk', 'dest': 'work'},
{'u_uuid': 110, 'p_uuid': 'aaa', 'mode': 'walk', 'dest': 'work'},
{'u_uuid': 110, 'p_uuid': 'aaa', 'mode': 'bus', 'dest': 'work'},
{'u_uuid': 110, 'p_uuid': 'aaa', 'mode': 'bus', 'dest': 'work'},
{'u_uuid': 110, 'p_uuid': 'aaa', 'mode': 'walk', 'dest': 'work'},
{'u_uuid': 110, 'p_uuid': 'bbb', 'mode': 'walk', 'dest': 'home'},
{'u_uuid': 110, 'p_uuid': 'bbb', 'mode': 'bus', 'dest': 'home'},
{'u_uuid': 110, 'p_uuid': 'bbb', 'mode': 'bus', 'dest': 'home'},
{'u_uuid': 110, 'p_uuid': 'bbb', 'mode': 'walk', 'dest': 'home'}]
df = pd.DataFrame(data)
df['id'] = df.groupby(['u_uuid', 'p_uuid', 'dest']).ngroup()
df['sub_id'] = df.groupby(['u_uuid', 'p_uuid', 'mode']).ngroup()
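For reference, the sketch below reruns that attempt on the same sample rows (rebuilt compactly; the names `rows` and `attempt` are only illustrative). With the default `sort=True`, `ngroup` assigns one zero-based number per distinct key combination, in sorted key order. As a result, the two separate walk runs inside a trajectory share a number, and the numbering does not restart for the next trajectory.

```python
import pandas as pd

# Same sample rows as above, built compactly (illustrative names).
modes_aaa = ['walk', 'walk', 'bus', 'bus', 'walk']
modes_bbb = ['walk', 'bus', 'bus', 'walk']
rows = (
    [{'u_uuid': 110, 'p_uuid': 'aaa', 'mode': m, 'dest': 'work'} for m in modes_aaa]
    + [{'u_uuid': 110, 'p_uuid': 'bbb', 'mode': m, 'dest': 'home'} for m in modes_bbb]
)
attempt = pd.DataFrame(rows)

# One number per distinct key combination, zero-based, in sorted key order.
attempt['id'] = attempt.groupby(['u_uuid', 'p_uuid', 'dest']).ngroup()
attempt['sub_id'] = attempt.groupby(['u_uuid', 'p_uuid', 'mode']).ngroup()

print(attempt['id'].tolist())      # [0, 0, 0, 0, 0, 1, 1, 1, 1]
print(attempt['sub_id'].tolist())  # [1, 1, 0, 0, 1, 3, 2, 2, 3]
```

The sub_id values depend only on the (u_uuid, p_uuid, mode) key, not on where a run of identical modes starts or ends, which is why a run-based numbering is needed.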
My DataFrame:
What I'm looking for:
Recommended answer
Use:
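# s1: number each (u_uuid, p_uuid, dest) group in order of appearance, starting at 1
s1 = df.groupby(['u_uuid', 'p_uuid', 'dest'], sort=False).ngroup().add(1)
# s2: number each run of consecutive identical modes within a (u_uuid, p_uuid) group
s2 = df.groupby(['u_uuid', 'p_uuid',
                 df['mode'].ne(df['mode'].shift()).cumsum()], sort=False).ngroup()
# sub_id: subtract the run number at the first row of each id so counting restarts at 1
df['sub_id'] = s2.sub(s2.where(s1.ne(s1.shift())).ffill()).add(1).astype(int)
df['id'] = s1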
print(df)
   u_uuid p_uuid  mode  dest  sub_id  id
0     110    aaa  walk  work       1   1
1     110    aaa  walk  work       1   1
2     110    aaa   bus  work       2   1
3     110    aaa   bus  work       2   1
4     110    aaa  walk  work       3   1
5     110    bbb  walk  home       1   2
6     110    bbb   bus  home       2   2
7     110    bbb   bus  home       2   2
8     110    bbb  walk  home       3   2
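To make the one-liner easier to follow, here is a minimal sketch that splits the same logic into named steps. The helper name `add_trajectory_ids` and the intermediate names (`traj`, `run`, `run_no`, `first_run`) are hypothetical, not part of the original answer. It also assumes, as in the sample, that the rows of each trajectory are contiguous and already in order, since the `shift`/`ffill` steps depend on row position.

```python
import pandas as pd

def add_trajectory_ids(frame):
    """Hypothetical helper: add 'id' and 'sub_id' columns to a copy of frame."""
    out = frame.copy()

    # Trajectory id: one number per (u_uuid, p_uuid, dest), in order of appearance.
    traj = out.groupby(['u_uuid', 'p_uuid', 'dest'], sort=False).ngroup().add(1)

    # Run label: increases by 1 every time 'mode' differs from the previous row.
    run = out['mode'].ne(out['mode'].shift()).cumsum().rename('run')

    # Global run number: one number per (u_uuid, p_uuid, run) combination.
    run_no = out.groupby(['u_uuid', 'p_uuid', run], sort=False).ngroup()

    # Run number at the first row of each trajectory, carried forward to its other rows.
    first_run = run_no.where(traj.ne(traj.shift())).ffill()

    # Offset so that numbering restarts at 1 inside every trajectory.
    out['sub_id'] = run_no.sub(first_run).add(1).astype(int)
    out['id'] = traj
    return out

# Same sample data as in the question, written column-wise.
sample = pd.DataFrame({
    'u_uuid': [110] * 9,
    'p_uuid': ['aaa'] * 5 + ['bbb'] * 4,
    'mode': ['walk', 'walk', 'bus', 'bus', 'walk', 'walk', 'bus', 'bus', 'walk'],
    'dest': ['work'] * 5 + ['home'] * 4,
})
result = add_trajectory_ids(sample)
print(result)
```

Note that row 5 starts a new trajectory with the same mode (walk) as row 4. Including u_uuid and p_uuid in the run grouping is what keeps those two rows in different runs.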