Python pandas convert rows to columns where multiple columns exist


Problem description

I have a DataFrame with multiple columns that I want to convert from rows to columns. Most solutions I have seen on Stack Overflow only deal with two columns.

From DF

PO ID   PO Name Region  Date    Price
1       AA      North   07/2016 100
2       BB      South   07/2016 200
1       AA      North   08/2016 300
2       BB      South   08/2016 400
1       AA      North   09/2016 500

To DF

PO ID   PO Name Region  07/2016 08/2016 09/2016
1       AA      North   100     300     500
2       BB      South   200     400     NaN

Recommended answer

Use set_index (with unstack) when every PO ID / PO Name / Region / Date combination is unique. If there are duplicates, an aggregate function is needed, via pivot_table or groupby:

print (df)
   PO ID PO Name Region     Date  Price
0      1      AA  North  07/2016    100 <-for PO ID;PO Name;Region;Date different Price
1      1      AA  North  07/2016    500 <-for PO ID;PO Name;Region;Date different Price
2      2      BB  South  07/2016    200
3      1      AA  North  08/2016    300
4      2      BB  South  08/2016    400
5      1      AA  North  09/2016    500

# assign to a new name so the original df stays available for the groupby alternative
df1 = df.pivot_table(index=['PO ID','PO Name','Region'],
                     columns='Date',
                     values='Price',
                     aggfunc='mean')
print (df1)
Date                  07/2016  08/2016  09/2016
PO ID PO Name Region                           
1     AA      North     300.0    300.0    500.0 <-(100+500)/2=300 for 07/2016
2     BB      South     200.0    400.0      NaN


df = df.groupby(['PO ID','PO Name','Region', 'Date'])['Price'].mean().unstack()
print (df)
Date                  07/2016  08/2016  09/2016
PO ID PO Name Region                           
1     AA      North     300.0    300.0    500.0 <-(100+500)/2=300 for 07/2016
2     BB      South     200.0    400.0      NaN
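To illustrate why an aggregate function is needed here, the following is a minimal sketch that rebuilds the sample frame with the duplicated 07/2016 row and tries a plain set_index / unstack reshape on it. The names dup, keys, reshape_failed and wide are illustrative only and are not part of the original answer.

```python
import pandas as pd

# Sample data from the answer, including the duplicated
# (PO ID, PO Name, Region, Date) combination for 07/2016.
dup = pd.DataFrame({
    'PO ID': [1, 1, 2, 1, 2, 1],
    'PO Name': ['AA', 'AA', 'BB', 'AA', 'BB', 'AA'],
    'Region': ['North', 'North', 'South', 'North', 'South', 'North'],
    'Date': ['07/2016', '07/2016', '07/2016', '08/2016', '08/2016', '09/2016'],
    'Price': [100, 500, 200, 300, 400, 500],
})

keys = ['PO ID', 'PO Name', 'Region', 'Date']

# Without aggregation, unstack cannot place two prices in one cell.
try:
    dup.set_index(keys)['Price'].unstack()
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print('unstack failed:', exc)

# Aggregating first collapses the duplicates, so the reshape succeeds.
wide = dup.groupby(keys)['Price'].mean().unstack()
print(wide)
```

Any other aggregate (for example sum, first or max) could be swapped in for mean, depending on what a duplicated price should mean in your data.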

Finally, move the index levels back into columns and drop the axis names:

df = df.reset_index().rename_axis(None).rename_axis(None, axis=1)
print (df)
   PO ID PO Name Region  07/2016  08/2016  09/2016
0      1      AA  North    300.0    300.0    500.0
1      2      BB  South    200.0    400.0      NaN
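For the case without duplicates, as in the question's original data, a minimal sketch of the set_index approach might look like the following. It assumes each PO ID / PO Name / Region / Date combination occurs only once. The variable name wide is illustrative.

```python
import pandas as pd

# The question's data: every (PO ID, PO Name, Region, Date) row is unique.
df = pd.DataFrame({
    'PO ID': [1, 2, 1, 2, 1],
    'PO Name': ['AA', 'BB', 'AA', 'BB', 'AA'],
    'Region': ['North', 'South', 'North', 'South', 'North'],
    'Date': ['07/2016', '07/2016', '08/2016', '08/2016', '09/2016'],
    'Price': [100, 200, 300, 400, 500],
})

wide = (
    df.set_index(['PO ID', 'PO Name', 'Region', 'Date'])['Price']
      .unstack()                    # the innermost level (Date) becomes the columns
      .reset_index()                # PO ID, PO Name, Region back to regular columns
      .rename_axis(None, axis=1)    # drop the leftover 'Date' columns name
)
print(wide)
```

Because the dates are plain strings, the new columns are sorted lexicographically. That matches chronological order within 2016, but it would not across years with an MM/YYYY format.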
