Transform dataframe according to index and labels
Question
I have a dataframe that looks something like this:
ID | TEXT | LABEL|
5 | blab | 0
5 | blub | 0
5 | gray | 0
4 | rose | 1
4 | work | 1
4 | app | 1
3 | car | 0
3 | ink | 0
1 | pink | 0
And I'm struggling to transform it to look like this:
ID | TEXT | TEXT| TEXT | LABEL|
5 | blab | blub| gray | 0
4 | rose | work| app | 1
3 | car | | | 0
1 | pink | | | 0
So far I have tried df.T and df.pivot(), but I can't seem to get it right. Any help is appreciated.
Recommended answer
This is similar to pivoting on two columns. Basically, you need to enumerate the rows within each group before pivoting, so that every row has a column position (0, 1, 2, ...) to land in:
# maybe groupby on `ID` is enough, depending on your data
(df.assign(col=df.groupby(['ID','LABEL']).cumcount())
   .pivot_table(index=['ID','LABEL'], columns='col',
                values='TEXT', aggfunc='first')
   .add_prefix('TEXT_')
   .reset_index()
)
Or similarly with set_index().unstack():
(df.set_index(['ID','LABEL', df.groupby(['ID']).cumcount()])
   ['TEXT'].unstack()
   .add_prefix('TEXT_')
   .reset_index()
)
Output:
col  ID  LABEL TEXT_0 TEXT_1 TEXT_2
0     1      0   pink    NaN    NaN
1     3      0    car    ink    NaN
2     4      1   rose   work    app
3     5      0   blab   blub   gray
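For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. The sample frame is rebuilt by hand from the table in the question. The names `position` and `wide` are only illustrative. The `fillna("")` and `sort_index(ascending=False)` steps are assumptions added to get closer to the layout asked for (blank cells, IDs in descending order); they are not part of the answer above.

```python
import pandas as pd

# Sample data rebuilt from the question's table.
df = pd.DataFrame({
    "ID":    [5, 5, 5, 4, 4, 4, 3, 3, 1],
    "TEXT":  ["blab", "blub", "gray", "rose", "work", "app", "car", "ink", "pink"],
    "LABEL": [0, 0, 0, 1, 1, 1, 0, 0, 0],
})

# Step 1: number the rows inside each (ID, LABEL) group: 0, 1, 2, ...
position = df.groupby(["ID", "LABEL"]).cumcount()

# Step 2: use that number as an extra index level and unstack it into columns.
wide = (
    df.set_index(["ID", "LABEL", position])["TEXT"]
      .unstack()
      .add_prefix("TEXT_")
      .fillna("")                   # assumption: blank cells instead of NaN
      .sort_index(ascending=False)  # assumption: IDs in descending order
      .reset_index()
)

print(wide)
```

The key step is `cumcount()`: without a per-group position there is nothing to pivot on, because `TEXT` values are all different and would each become their own column.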