Pivot table with multiple aggfunc sum and normalize one column


Question

I have a pivot table, and I can't find a way to apply two aggregations to it: the sum and the percentage (each value's proportion of the total sum).

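import pandas as pd

# natnl_valyuta is the source DataFrame; aggfunc='sum' is equivalent to np.sum here
table = pd.pivot_table(natnl_valyuta, values='vsego_zadoljennost', index=['koridor_procent'],
                       columns=['yur_fiz', 'srok'], aggfunc='sum', margins=True)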

Result of the pivot table

I also have to calculate each value's proportion of the total sum, as a percentage.

Expected output table

Recommended answer

For my tests I created the following "surrogate" DataFrame:

import io
import pandas as pd

txt = ''',FL,FL,YUL,YUL
,1-Kpatk,3-Dolg,1-Kpatk,3-Dolg
0-5,            0,  469532,       0, 3421599
10-15,          2,  342485,    3394, 1084686
16-20,        349,  419492,  131095, 1578722
20 i bolee, 33941,  482238,  167077,  343972
6-10,         309, 1688537,   16479,  486425'''
table = pd.read_csv(io.StringIO(txt), header=[0,1], skipinitialspace=True,
    index_col=0, dtype={0:'object'})
table['All'] = table.sum(axis=1)    # row totals, like the margins column
table.columns = table.columns.set_names(['yur_fiz', 'srok'])
# Totals row; DataFrame.append was removed in pandas 2.0, so use pd.concat
table = pd.concat([table, table.sum(axis=0).rename('All').to_frame().T])
table.index.name = 'koridor_procent'

Create it, print it, and check that it matches the result of your pivot_table.
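For comparison, here is a minimal sketch of what `margins=True` produces. The raw rows are hypothetical: the column names follow the question, but the values are invented for illustration. It checks that the `All` row and column are plain sums, which is what the surrogate table reproduces by hand.

```python
import pandas as pd

# Hypothetical raw rows: column names follow the question, values are made up.
raw = pd.DataFrame({
    'koridor_procent': ['0-5'] * 4 + ['6-10'] * 4,
    'yur_fiz': ['FL', 'FL', 'YUL', 'YUL'] * 2,
    'srok': ['1-Kpatk', '3-Dolg'] * 4,
    'vsego_zadoljennost': [10, 200, 30, 400, 50, 600, 70, 800],
})

pivot = pd.pivot_table(raw, values='vsego_zadoljennost', index='koridor_procent',
                       columns=['yur_fiz', 'srok'], aggfunc='sum', margins=True)
print(pivot)

# The margins are appended last: final row and final column are both labelled 'All'.
body = pivot.iloc[:-1, :-1]
cols_ok = (body.sum(axis=0).to_numpy() == pivot.iloc[-1, :-1].to_numpy()).all()
rows_ok = (body.sum(axis=1).to_numpy() == pivot.iloc[:-1, -1].to_numpy()).all()
print(cols_ok, rows_ok)
```

If both checks print `True` for your own data, a table built like the surrogate can stand in for the pivot table in the steps that follow.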

Your task can be performed in the following steps:

  1. To get the proper row order, create a CategoricalIndex and reindex the table with it:

cats = ['0-5', '6-10', '10-15', '16-20', '20 i bolee', 'All']
table = table.reindex(pd.CategoricalIndex(cats, categories=cats).rename(table.index.name))

  2. Reorder the columns: sort by the second level in descending order, then arrange the top level as YUL, FL, All:

    table = table.sort_index(axis=1, level=1, ascending=False)
    table = table.reindex(['YUL', 'FL', 'All'], level=0, axis=1)
    

  3. Generate the percentage columns:

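    pctCols = []
    # DataFrame.iteritems was removed in pandas 2.0; items() is the replacement
    for colName, col in table.items():
        if colName[0] != 'All':
            # col.iloc[-1] is the column total taken from the 'All' row
            pctCol = (col / col.iloc[-1] * 100).round(1).astype('str') + '%'
            pctCol.name = (colName[0], 'dola')
            pctCols.append(pctCol)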
    

  4. Insert the percentage columns:

    pos = 1
    for col in pctCols:
        table.insert(pos, column=col.name, value=col, allow_duplicates=True)
        pos += 2
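The percentage step above loops over the columns one by one. As an alternative, here is a minimal vectorized sketch, assuming each column is simply divided by its own total. The names `sums`, `totals` and `shares` are illustrative, and the stand-in table holds only the FL columns of the test data.

```python
import pandas as pd

# Stand-in for the sums table: the FL columns of the test data above.
cols = pd.MultiIndex.from_tuples(
    [('FL', '1-Kpatk'), ('FL', '3-Dolg')], names=['yur_fiz', 'srok'])
idx = pd.Index(['0-5', '6-10', '10-15', '16-20', '20 i bolee'],
               name='koridor_procent')
sums = pd.DataFrame(
    [[0, 469532], [309, 1688537], [2, 342485], [349, 419492], [33941, 482238]],
    index=idx, columns=cols)

totals = sums.sum(axis=0)                            # one total per column
shares = sums.div(totals, axis=1).mul(100).round(1)  # numeric percentages
print(shares)
```

Here `shares` stays numeric. The loop above converts to strings straight away, which is convenient for display but rules out further arithmetic on those columns.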
    

For the above test data, I got the following result:

    yur_fiz              YUL                               FL                               All
    srok              3-Dolg    dola 1-Kpatk    dola   3-Dolg    dola 1-Kpatk    dola           
    koridor_procent                                                                             
    0-5              3421599   49.5%       0    0.0%   469532   13.8%       0    0.0%   3891131 
    6-10              486425    7.0%   16479    5.2%  1688537   49.6%     309    0.9%   2191750 
    10-15            1084686   15.7%    3394    1.1%   342485   10.1%       2    0.0%   1430567 
    16-20            1578722   22.8%  131095   41.2%   419492   12.3%     349    1.0%   2129658 
    20 i bolee        343972    5.0%  167077   52.5%   482238   14.2%   33941   98.1%   1027228 
    All              6915404  100.0%  318045  100.0%  3402284  100.0%   34601  100.0%  10670334 
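In this result the `dola` columns hold strings such as `13.8%`. If the percentages are needed for further calculation, one option is to keep them numeric and format them only when printing. The sketch below assumes that approach; `shares` and `labels` are illustrative names, and the values are copied from the FL / 3-Dolg `dola` column above.

```python
import pandas as pd

# Numeric shares for the FL / 3-Dolg column, copied from the result above.
shares = pd.Series([13.8, 49.6, 10.1, 12.3, 14.2],
                   index=['0-5', '6-10', '10-15', '16-20', '20 i bolee'],
                   name='dola')

labels = shares.map('{:.1f}%'.format)  # strings, for display only
print(labels)

# Arithmetic still works on the numeric column.
print(shares.idxmax(), shares.max())
```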
    
