Pandas dataframe: create a new ID variable based on the number of modalities of an existing one

Views: 76 Published: 2020/5/24 4:03:51 Tags: python python-3.x pandas dataframe


Question

Dataframe df contains an ID variable holding the IDs of groups of observations. But the ID values have "holes" (they can be 1, 3, 4, 7 without 0, 2, 5, 6).

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6], 'b': [7, 8, 9, 10, 11, 12],
                   'id': [1, 4, 4, 7, 3, 1]})

   a   b  id
0  1   7   1
1  2   8   4
2  3   9   4
3  4  10   7
4  5  11   3
5  6  12   1

My goal is to replace the existing ID variable with a new one whose values run consecutively from 0 up to the number of distinct IDs in the original variable minus one, keeping the order of the original IDs, such as:

df2 = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6], 'b': [7, 8, 9, 10, 11, 12],
                    'id': [0, 2, 2, 3, 1, 0]})

   a   b  id
0  1   7   0
1  2   8   2
2  3   9   2
3  4  10   3
4  5  11   1
5  6  12   0

How can I do this?

Thanks for your time!

Recommended answer

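pd.factorize supports this. It returns a tuple of (codes, uniques); with sort=True the unique values are sorted before the codes are assigned, so the smallest original ID becomes 0, the next smallest 1, and so on: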

df['id'] = pd.factorize(df['id'], sort=True)[0]

#    a   b  id
# 0  1   7   0
# 1  2   8   2
# 2  3   9   2
# 3  4  10   3
# 4  5  11   1
# 5  6  12   0
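
To make the role of sort=True more concrete, here is a minimal sketch built on the question's example data. It contrasts sorted codes with the default order-of-first-appearance codes, cross-checks the result against a dense ranking, and shows how the second return value can map the new codes back to the original IDs. The variable names (codes_by_appearance, codes_by_rank, restored) are illustrative only and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6],
                   'b': [7, 8, 9, 10, 11, 12],
                   'id': [1, 4, 4, 7, 3, 1]})

# factorize returns (codes, uniques). With sort=True the uniques are sorted,
# so code k corresponds to the k-th smallest original ID.
codes, uniques = pd.factorize(df['id'], sort=True)

# Without sort=True, codes follow the order of first appearance instead.
codes_by_appearance, _ = pd.factorize(df['id'])

# Dense ranking (1-based) shifted down by one is an alternative that
# produces the same sorted codes for this data.
codes_by_rank = df['id'].rank(method='dense').astype(int) - 1

# The original IDs can be recovered by looking up each code in uniques.
restored = uniques.take(codes)

print(codes.tolist())                # [0, 2, 2, 3, 1, 0]
print(list(uniques))                 # [1, 3, 4, 7]
print(codes_by_appearance.tolist())  # [0, 1, 1, 2, 3, 0]
```

Keeping uniques around is useful if the original IDs are needed later, since assigning the codes back to df['id'] overwrites them.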
