Create a Unique Index from Column of Data


Problem description

I have a Pandas DataFrame whose records are unique, but I need to create a unique key based on one of its columns. Below is sample data, along with my attempt to create a second column by iterating through the data and incrementing a counter by one. My plan was to join the two columns to form the unique key.

Question: Is there a better approach? What is flawed with my approach?

import pandas as pd
import numpy as np

d = {'subid': {0: '327598650129611740', 1: '327598650129611740', 2: '327559921352747760', 3: '327676431535405027', 4: '327676431535405027', 5: '327676431535405027', 6: '327662567602840733', 7: '327778468325442201', 8: '327777161261272775', 9: '327777161261272775'}}

df = pd.DataFrame(d)
old_index = 0
child_no = 1
for subid, row in df.iterrows():
    if subid == old_index:
        df['child_no'] = child_no + 1
        old_index = subid
        child_no = child_no + 1
    else:
        child_no = 1
        df['child_no'] = child_no
        old_index = subid

df


    subid               child_no
0   327598650129611740  1
1   327598650129611740  1
2   327559921352747760  1
3   327676431535405027  1
4   327676431535405027  1
5   327676431535405027  1
6   327662567602840733  1
7   327778468325442201  1
8   327777161261272775  1
9   327777161261272775  1

Desired result

    subid               child_no
0   327598650129611740  1
1   327598650129611740  2
2   327559921352747760  1
3   327676431535405027  1
4   327676431535405027  2
5   327676431535405027  3
6   327662567602840733  1
7   327778468325442201  1
8   327777161261272775  1
9   327777161261272775  2

Any help would be appreciated.

Recommended answer

You can group by 'subid', then call cumcount and add 1, since cumcount starts counting from 0:

In [30]:
df['child_no'] = df.groupby('subid').cumcount()+1
df
Out[30]:
                subid  child_no
0  327598650129611740         1
1  327598650129611740         2
2  327559921352747760         1
3  327676431535405027         1
4  327676431535405027         2
5  327676431535405027         3
6  327662567602840733         1
7  327778468325442201         1
8  327777161261272775         1
9  327777161261272775         2
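
The question also mentions joining the two columns into a single unique key. A minimal sketch of that last step follows, assuming a string key is acceptable; the column name unique_key and the "_" separator are illustrative choices, not something the question specifies.

```python
import pandas as pd

# Sample data from the question, rebuilt as a plain list.
df = pd.DataFrame({
    "subid": [
        "327598650129611740", "327598650129611740", "327559921352747760",
        "327676431535405027", "327676431535405027", "327676431535405027",
        "327662567602840733", "327778468325442201", "327777161261272775",
        "327777161261272775",
    ]
})

# Number each row within its 'subid' group, starting from 1.
df["child_no"] = df.groupby("subid").cumcount() + 1

# Hypothetical key format: '<subid>_<child_no>' (name and separator are assumptions).
df["unique_key"] = df["subid"] + "_" + df["child_no"].astype(str)

# Sanity check: the combined key should contain no duplicates.
assert df["unique_key"].is_unique
```

As for what goes wrong in the loop from the question: df.iterrows() yields (index label, row) pairs, so the variable named subid actually holds 0, 1, 2, ... rather than the value in the 'subid' column. In addition, df['child_no'] = child_no assigns one scalar to the entire column on every pass, so only the final assignment (1) survives.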
