Transform dataframe according to index and labels

Question

I have a dataframe that looks something like this:

ID | TEXT | LABEL
5  | blab | 0
5  | blub | 0
5  | gray | 0
4  | rose | 1
4  | work | 1
4  | app  | 1
3  | car  | 0
3  | ink  | 0
1  | pink | 0

And I'm struggling to transform it to look like this:

ID | TEXT | TEXT | TEXT | LABEL
5  | blab | blub | gray | 0
4  | rose | work | app  | 1
3  | car  | ink  |      | 0
1  | pink |      |      | 0

I have tried df.T and df.pivot() so far, but I can't seem to get it right. Any help is appreciated.

Recommended answer

This is similar to pivoting on two columns. Basically, you need to enumerate the rows within each group before pivoting, so that every row has a column position (0, 1, 2, ...) to be spread into:

# maybe groupby on `ID` is enough, depending on your data
(df.assign(col=df.groupby(['ID','LABEL']).cumcount())
   .pivot_table(index=['ID','LABEL'], columns='col', 
                values='TEXT', aggfunc='first')
   .add_prefix('TEXT_')
   .reset_index() 
)
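
To see what the enumeration step contributes on its own, here is a minimal sketch. The frame is reconstructed from the table in the question, and the name `position` is only illustrative:

```python
import pandas as pd

# Data reconstructed from the table in the question (an assumption
# about how the original frame was built).
df = pd.DataFrame({
    "ID":    [5, 5, 5, 4, 4, 4, 3, 3, 1],
    "TEXT":  ["blab", "blub", "gray", "rose", "work", "app",
              "car", "ink", "pink"],
    "LABEL": [0, 0, 0, 1, 1, 1, 0, 0, 0],
})

# cumcount() numbers the rows inside each (ID, LABEL) group, starting at 0.
# These numbers become the column positions of the wide table.
position = df.groupby(["ID", "LABEL"]).cumcount()
print(df.assign(col=position))
```

Without this helper column there is nothing to tell the pivot which TEXT value of a group goes into which output column.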

Or similarly with set_index().unstack():

(df.set_index(['ID','LABEL', df.groupby(['ID']).cumcount()])
   ['TEXT'].unstack()
   .add_prefix('TEXT_')
   .reset_index() 
)

Output:

col  ID  LABEL TEXT_0 TEXT_1 TEXT_2
0     1      0   pink    NaN    NaN
1     3      0    car    ink    NaN
2     4      1   rose   work    app
3     5      0   blab   blub   gray
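
The output above is sorted by ascending ID and uses NaN for missing cells, while the layout asked for in the question lists the IDs as 5, 4, 3, 1 with blanks. A possible end-to-end sketch that gets closer to that layout follows. The frame is again reconstructed from the question's table, the name `wide` is illustrative, and sorting descending only matches the requested order because the IDs happen to appear in descending order in the sample data:

```python
import pandas as pd

# Data reconstructed from the table in the question.
df = pd.DataFrame({
    "ID":    [5, 5, 5, 4, 4, 4, 3, 3, 1],
    "TEXT":  ["blab", "blub", "gray", "rose", "work", "app",
              "car", "ink", "pink"],
    "LABEL": [0, 0, 0, 1, 1, 1, 0, 0, 0],
})

wide = (
    df.assign(col=df.groupby(["ID", "LABEL"]).cumcount())
      .pivot_table(index=["ID", "LABEL"], columns="col",
                   values="TEXT", aggfunc="first")
      .add_prefix("TEXT_")
      .sort_index(level="ID", ascending=False)  # 5, 4, 3, 1 for this data
      .fillna("")                               # blanks instead of NaN
      .reset_index()
      .rename_axis(None, axis="columns")        # drop the leftover 'col' label
)
print(wide)
```

Moving LABEL to the last column, as in the question's sketch, would be a plain column reorder on `wide` afterwards.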
