Assign Unique Numeric Group IDs to Groups in Pandas

Views: 60 Published: 2020/5/24 1:18:12 Tags: python pandas pandas-groupby

Question

I keep running into the problem of having to assign a unique ID to each group in a data set. I've needed this when zero-padding for RNNs, generating graphs, and on many other occasions.

This can usually be done by concatenating the values in each groupby column. However, the number of columns that define a group, their dtypes, or the size of their values often make concatenation an impractical solution that needlessly uses up memory.

Is there an easy way to assign a unique numeric ID to groups in pandas?

Recommended answer

You just need ngroup, using the data from seeiespi (or pd.factorize):

df.groupby('C').ngroup()
Out[322]: 
0    0
1    0
2    2
3    1
4    1
5    1
6    1
7    2
8    2
dtype: int64
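
The DataFrame behind this output is not shown, so the following is only a minimal sketch under stated assumptions: the values in column C are hypothetical, chosen so that they reproduce the IDs above, and column D is invented purely to illustrate the multi-column case raised in the question. Passing a list of columns to groupby appears to give one ID per distinct combination without building concatenated keys.

```python
import pandas as pd

# Hypothetical data: "C" is chosen to reproduce the IDs shown above;
# "D" is an invented second key column, used only for illustration.
df = pd.DataFrame({
    "C": ["a", "a", "c", "b", "b", "b", "b", "c", "c"],
    "D": [1, 1, 2, 1, 1, 2, 2, 2, 2],
})

# One ID per distinct value of C. With the default sort=True,
# IDs follow the sorted order of the group keys (a=0, b=1, c=2).
df["gid_single"] = df.groupby("C").ngroup()

# One ID per distinct (C, D) combination, with no string concatenation.
df["gid_multi"] = df.groupby(["C", "D"]).ngroup()

print(df)
```

The resulting integer columns can then be used directly as compact group labels, for example as keys for padding or for building graph edges.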

More options

pd.factorize(df.C)[0]
Out[323]: array([0, 0, 1, 2, 2, 2, 2, 1, 1], dtype=int64)
df.C.astype('category').cat.codes
Out[324]: 
0    0
1    0
2    2
3    1
4    1
5    1
6    1
7    2
8    2
dtype: int8
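
The outputs above differ in numbering: pd.factorize numbers values in order of first appearance, while ngroup and the category codes follow sorted key order. The sketch below, which again assumes hypothetical values for column C, shows how the sort arguments seem to reconcile the two conventions. It is an illustration rather than the answerer's own code.

```python
import pandas as pd

# Hypothetical values for column C (the original data is not shown).
df = pd.DataFrame({"C": ["a", "a", "c", "b", "b", "b", "b", "c", "c"]})

# Order of first appearance: a=0, c=1, b=2.
codes_appearance, uniques = pd.factorize(df["C"])
ngroup_appearance = df.groupby("C", sort=False).ngroup()

# Sorted key order: a=0, b=1, c=2.
codes_sorted, uniques_sorted = pd.factorize(df["C"], sort=True)
ngroup_sorted = df.groupby("C").ngroup()
cat_codes = df["C"].astype("category").cat.codes

print(list(codes_appearance), list(codes_sorted))
```

Whichever convention is used, each group still receives a unique integer; the choice only matters if the IDs must match across methods.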
