Python pandas convert rows to columns where multiple columns exist
Question
I have a DataFrame with multiple columns that I want to convert from rows to columns. Most solutions I have seen on Stack Overflow only deal with two columns.
From DF:
PO ID  PO Name  Region  Date     Price
1      AA       North   07/2016  100
2      BB       South   07/2016  200
1      AA       North   08/2016  300
2      BB       South   08/2016  400
1      AA       North   09/2016  500
To DF:
PO ID  PO Name  Region  07/2016  08/2016  09/2016
1      AA       North   100      300      500
2      BB       South   200      400      NaN
Recommended answer
Use set_index with unstack when each combination of PO ID, PO Name, Region and Date occurs only once. If there are duplicates, you need an aggregate function, using pivot_table or groupby:
print(df)
   PO ID PO Name Region     Date  Price
0      1      AA  North  07/2016    100  <-for PO ID;PO Name;Region;Date different Price
1      1      AA  North  07/2016    500  <-for PO ID;PO Name;Region;Date different Price
2      2      BB  South  07/2016    200
3      1      AA  North  08/2016    300
4      2      BB  South  08/2016    400
5      1      AA  North  09/2016    500
df = df.pivot_table(index=['PO ID','PO Name','Region'],
                    columns='Date',
                    values='Price',
                    aggfunc='mean')
print(df)
Date                   07/2016  08/2016  09/2016
PO ID PO Name Region
1     AA      North      300.0    300.0    500.0  <-(100+500)/2=300 for 07/2016
2     BB      South      200.0    400.0      NaN
# Alternative to pivot_table: run this on the original df, not on the pivoted result
df = df.groupby(['PO ID','PO Name','Region', 'Date'])['Price'].mean().unstack()
print(df)
Date                   07/2016  08/2016  09/2016
PO ID PO Name Region
1     AA      North      300.0    300.0    500.0  <-(100+500)/2=300 for 07/2016
2     BB      South      200.0    400.0      NaN
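As a quick check on the two approaches above, the sketch below rebuilds the sample frame with the duplicated 07/2016 row and compares the pivot_table and groupby results. The names keys, by_pivot, by_groupby and same are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the answer, including the duplicated (1, AA, North, 07/2016) row
df = pd.DataFrame({
    "PO ID": [1, 1, 2, 1, 2, 1],
    "PO Name": ["AA", "AA", "BB", "AA", "BB", "AA"],
    "Region": ["North", "North", "South", "North", "South", "North"],
    "Date": ["07/2016", "07/2016", "07/2016", "08/2016", "08/2016", "09/2016"],
    "Price": [100, 500, 200, 300, 400, 500],
})

keys = ["PO ID", "PO Name", "Region"]  # hypothetical helper name

# Approach 1: pivot_table with an explicit aggregate function
by_pivot = df.pivot_table(index=keys, columns="Date", values="Price", aggfunc="mean")

# Approach 2: groupby on the keys plus Date, aggregate, then move Date into columns
by_groupby = df.groupby(keys + ["Date"])["Price"].mean().unstack()

# Compare the two wide frames
same = by_pivot.equals(by_groupby)
print(same)
```

Both calls start from the original long frame, so neither depends on the other having been run first.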
Finally, turn the index levels back into columns and remove the axis names:
df = df.reset_index().rename_axis(None).rename_axis(None, axis=1)
print(df)
   PO ID PO Name Region  07/2016  08/2016  09/2016
0      1      AA  North    300.0    300.0    500.0
1      2      BB  South    200.0    400.0      NaN
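For the case without duplicates, as in the question's data, a minimal end-to-end sketch might look like the following. It assumes set_index is paired with unstack on the Date level; the names keys, wide, flat, dup and unstack_failed are illustrative only. The second part shows why duplicated key combinations call for an aggregate function instead.

```python
import pandas as pd

# Data from the question: every (PO ID, PO Name, Region, Date) combination is unique
df = pd.DataFrame({
    "PO ID": [1, 2, 1, 2, 1],
    "PO Name": ["AA", "BB", "AA", "BB", "AA"],
    "Region": ["North", "South", "North", "South", "North"],
    "Date": ["07/2016", "07/2016", "08/2016", "08/2016", "09/2016"],
    "Price": [100, 200, 300, 400, 500],
})

keys = ["PO ID", "PO Name", "Region"]  # hypothetical helper name

# Move the keys and Date into the index, then pivot Date (the last level) into columns
wide = df.set_index(keys + ["Date"])["Price"].unstack()

# Restore the keys as ordinary columns and drop the "Date" label on the column axis
flat = wide.reset_index().rename_axis(None, axis=1)
print(flat)

# With a duplicated key combination, unstack cannot reshape and raises ValueError
dup = pd.concat([df, df.iloc[[0]].assign(Price=500)], ignore_index=True)
try:
    dup.set_index(keys + ["Date"])["Price"].unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True
print(unstack_failed)
```

The Price columns come back as floats because the missing 09/2016 value for PO ID 2 is stored as NaN.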