Unmelt Pandas DataFrame


Question

I have a pandas DataFrame with two id variables:

import pandas as pd

df = pd.DataFrame({'id': [1,1,1,2,2,3],
                   'num': [10,10,12,13,14,15],
                   'q': ['a', 'b', 'd', 'a', 'b', 'z'],
                   'v': [2,4,6,8,10,12]})

   id  num  q   v
0   1   10  a   2
1   1   10  b   4
2   1   12  d   6
3   2   13  a   8
4   2   14  b  10
5   3   15  z  12

I can pivot the table with:

df.pivot(index='id', columns='q', values='v')

And end up with something close:

q    a   b   d   z
id                
1    2   4   6 NaN
2    8  10 NaN NaN
3  NaN NaN NaN  12

However, what I really want is the original unmelted form:

id   num   a   b   d   z               
1    10   2   4 NaN NaN
1    12 NaN NaN   6 NaN  
2    13   8 NaN NaN NaN
2    14 NaN  10 NaN NaN
3    15 NaN NaN NaN  12

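In other words:

  1. 'id' and 'num' are my index (usually I have only seen either 'id' or 'num' used as the index, but since I want to recover the original unmelted form I need both)
  2. 'q' holds my column names
  3. 'v' holds the values in the table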

Update

[...] However, the format is not quite the same as what I want above.

Recommended answer

You're really close, slaw. Just rename your column index to None and you've got what you want.

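df2 = df.pivot_table(index=['id','num'], columns='q')
# Drop the outer 'v' column level and clear the 'q' column-index name
df2.columns = df2.columns.droplevel().rename(None)
# Write NaN as "null" to a tab-separated file without the row index
df2.reset_index().fillna("null").to_csv("test.csv", sep="\t", index=False)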
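
For anyone who wants to run this against the sample data from the question, here is a minimal sketch. It assumes `values='v'` is passed explicitly (not part of the snippet above), which gives single-level columns so no `droplevel` is needed. The name `wide` is just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 3],
                   'num': [10, 10, 12, 13, 14, 15],
                   'q': ['a', 'b', 'd', 'a', 'b', 'z'],
                   'v': [2, 4, 6, 8, 10, 12]})

# Both id variables form the index; 'q' supplies the new column names.
# The default aggregation (mean) leaves the values unchanged here because
# every (id, num, q) combination occurs only once.
wide = df.pivot_table(index=['id', 'num'], columns='q', values='v')

# Clear the 'q' label on the column index, then turn id/num back into columns.
wide.columns.name = None
wide = wide.reset_index()

print(wide)
```

The values come back as floats because the missing cells are filled with NaN.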

Note that the 'v' column is expected to be numeric by default so that it can be aggregated. Otherwise, pandas will error out with:

DataError: No numeric types to aggregate

To resolve this, you can specify your own aggregation function by using a custom lambda function:

df2 = df.pivot_table(index=['id','num'], columns='q', aggfunc=lambda x: x)
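
As an alternative sketch for non-numeric values, the built-in `'first'` aggregation can be passed in place of the lambda; this is a swapped-in technique, not what the answer above uses. The frame `df_str` and its string values are made up for illustration, and `'first'` is only a safe choice when each (id, num, q) combination appears once, as it does here.

```python
import pandas as pd

# Hypothetical variant of the sample data with strings in 'v'.
df_str = pd.DataFrame({'id': [1, 1, 1, 2, 2, 3],
                       'num': [10, 10, 12, 13, 14, 15],
                       'q': ['a', 'b', 'd', 'a', 'b', 'z'],
                       'v': ['w', 'x', 'y', 'x', 'w', 'y']})

# 'first' takes the first value found in each cell, so no numeric
# aggregation is attempted on the string column.
wide_str = df_str.pivot_table(index=['id', 'num'], columns='q',
                              values='v', aggfunc='first')

# Same clean-up as before: clear the column-index name, restore id/num.
wide_str.columns.name = None
wide_str = wide_str.reset_index()

print(wide_str)
```

If a combination could occur more than once, `'first'` would silently discard the extra rows, so it is worth checking for duplicates before relying on it.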
