Pandas: Unstacking One Column of a DataFrame

Question

I want to unstack one column in my pandas DataFrame. The DataFrame is indexed by 'Date', and I want to unstack the 'Country' column so that each country becomes its own column. The current DataFrame looks like this:

             Country   Product      Flow Unit  Quantity  
Date                                                         
2002-01-31   FINLAND  KEROSENE  TOTEXPSB  KBD    3.8129     
2002-01-31    TURKEY  KEROSENE  TOTEXPSB  KBD    0.2542     
2002-01-31  AUSTRALI  KEROSENE  TOTEXPSB  KBD   12.2787     
2002-01-31    CANADA  KEROSENE  TOTEXPSB  KBD    5.1161     
2002-01-31        UK  KEROSENE  TOTEXPSB  KBD   12.2013     

When I use df.pivot I get the error "ReshapeError: Index contains duplicate entries, cannot reshape". That makes sense, since every country reports on the same dates. What I would like is to unstack the 'Country' column so that each month's date appears only once.

The resulting DataFrame would have headers like this, with Date still as the index:

Date        FINLAND  TURKEY  AUSTRALI  CANADA  Flow      Unit
2002-01-31  3.8129   0.2542  12.2787   5.1161  TOTEXPSB  KBD

I have worked on this for a while without getting anywhere, so any direction or insight would be great.

Also, note that you are only seeing the head of the DataFrame; there are years of data in this format.

Thanks

Douglas

Recommended answer

If you can drop Product, Unit, and Flow, then it should be as easy as

df.reset_index().pivot(columns='Country', index='Date', values='Quantity')

which gives

Country     AUSTRALI  CANADA  FINLAND  TURKEY       UK
Date
2002-01-31   12.2787  5.1161   3.8129  0.2542  12.2013
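
For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the five rows shown in the question and applies the pivot above. The two variations after it are assumptions that go beyond the original answer. One keeps Product, Flow, and Unit as extra index levels instead of dropping them. The other uses pivot_table with a sum, which is only appropriate if repeated (Date, Country) rows in the full data are really meant to be added together. The names wide, wide_keep, and wide_agg are illustrative.

```python
import pandas as pd

# Rebuild the head() shown in the question (sample values only).
df = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2002-01-31"] * 5),
        "Country": ["FINLAND", "TURKEY", "AUSTRALI", "CANADA", "UK"],
        "Product": ["KEROSENE"] * 5,
        "Flow": ["TOTEXPSB"] * 5,
        "Unit": ["KBD"] * 5,
        "Quantity": [3.8129, 0.2542, 12.2787, 5.1161, 12.2013],
    }
).set_index("Date")

# The pivot from the answer: one row per Date, one column per Country.
# Product, Flow and Unit are dropped.
wide = df.reset_index().pivot(index="Date", columns="Country", values="Quantity")

# Variation (assumption): keep Product, Flow and Unit as index levels
# and move only Country into the columns.
wide_keep = (
    df.reset_index()
    .set_index(["Date", "Product", "Flow", "Unit", "Country"])["Quantity"]
    .unstack("Country")
)

# Variation (assumption): if the same (Date, Country) pair occurs more
# than once, pivot raises the duplicate-entries error; pivot_table
# aggregates the duplicates instead (summed here).
wide_agg = df.reset_index().pivot_table(
    index="Date", columns="Country", values="Quantity", aggfunc="sum"
)

print(wide)
print(wide_keep)
```

Calling wide_keep.reset_index(["Product", "Flow", "Unit"]) would turn the extra index levels back into ordinary columns, which is close to the layout sketched in the question.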
