Pandas: create a Foreign ID column based on a Name column

Question

For example, I have a simple dataframe like this:

import pandas as pd
df = pd.DataFrame({'Name': ['John Doe', 'Jane Smith', 'John Doe', 'Jane Smith', 'Jack Dawson', 'John Doe']})
df:
        Name
    0   John Doe
    1   Jane Smith
    2   John Doe
    3   Jane Smith
    4   Jack Dawson
    5   John Doe

I want to add a column ['foreign_key'] that assigns a unique ID to each unique name (rows with the same name should share the same 'foreign_key'). The final output should look like this:

df:
            Name        Foreign_Key
        0   John Doe    foreignkey1
        1   Jane Smith  foreignkey2
        2   John Doe    foreignkey1
        3   Jane Smith  foreignkey2
        4   Jack Dawson foreignkey3
        5   John Doe    foreignkey1

I'm trying to use groupby with a custom function applied to each group. So my first step is:

name_groupby = df.groupby('Name')

That's the split step; next come apply and combine. I can't find anything in the docs that matches this case, and I'm unsure where to go from here.

The custom function I started writing looks like this:

def make_foreign_key(groupby_df):
    return groupby_df['Foreign_Key'] = 'foreign_key' + num

Any help would be greatly appreciated!

Recommended answer

You could do this:

pd.merge(
    df,
    pd.DataFrame(df.Name.unique(), columns=['Name']).reset_index().rename(columns={'index': 'Foreign_Key'}),
    on='Name'
)

         Name  Foreign_Key
0    John Doe            0
1    John Doe            0
2  Jane Smith            1
3  Jane Smith            1
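
The merge above yields integer IDs. If string labels such as foreignkey1 are wanted, as in the question's desired output, one possible alternative (not the answerer's method) is pd.factorize, or groupby(...).ngroup() for those who prefer to stay with groupby. The sketch below assumes IDs should be numbered from 1 in order of first appearance; the column name Foreign_Key_Alt is only an illustrative name for comparing the two approaches.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['John Doe', 'Jane Smith', 'John Doe',
                            'Jane Smith', 'Jack Dawson', 'John Doe']})

# Option 1: pd.factorize returns (codes, uniques). The codes are 0-based
# integers assigned in order of first appearance, so equal names share a code.
codes, uniques = pd.factorize(df['Name'])
df['Foreign_Key'] = ['foreignkey' + str(code + 1) for code in codes]

# Option 2: groupby + ngroup numbers each group. With sort=False the groups
# are numbered in order of first appearance rather than alphabetically.
# 'Foreign_Key_Alt' is a hypothetical column name used only for comparison.
group_ids = df.groupby('Name', sort=False).ngroup()
df['Foreign_Key_Alt'] = 'foreignkey' + (group_ids + 1).astype(str)

print(df)
```

Both columns should hold the same labels; dropping the 'foreignkey' prefix and the + 1 gives plain 0-based integer IDs comparable to the merge result.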
