Pivot table with multiple aggfunc: sum and normalize one column
Question
I have a pivot table and can't find a way to apply two aggregations to it: the sum and the percentage (proportion of the total sum).
table = pd.pivot_table(natnl_valyuta, values='vsego_zadoljennost', index=['koridor_procent'],
                       columns=['yur_fiz', 'srok'], aggfunc=np.sum, margins=True)
I also need to calculate each value's proportion of the total sum, as a percentage.
Recommended answer
For my tests I created the following "surrogate" DataFrame:
import io
import pandas as pd

txt = ''',FL,FL,YUL,YUL
,1-Kpatk,3-Dolg,1-Kpatk,3-Dolg
0-5, 0, 469532, 0, 3421599
10-15, 2, 342485, 3394, 1084686
16-20, 349, 419492, 131095, 1578722
20 i bolee, 33941, 482238, 167077, 343972
6-10, 309, 1688537, 16479, 486425'''
table = pd.read_csv(io.StringIO(txt), header=[0,1], skipinitialspace=True,
                    index_col=0, dtype={0:'object'})
# Row totals (the 'All' column)
table['All'] = table.sum(axis=1)
table.columns.names = ['yur_fiz', 'srok']
# Column totals (the 'All' row); DataFrame.append was removed in pandas 2.0,
# so the totals row is added with pd.concat instead
table = pd.concat([table, table.sum(axis=0).rename('All').to_frame().T])
table.index.name = 'koridor_procent'
Create it, print it, and check that it matches the result of your pivot_table.
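To make that comparison concrete, here is a minimal sketch of the question's pivot_table call applied to a tiny, made-up set of raw records. The column names come from the question; the `raw` frame, its values, and the name `demo` are assumptions for illustration only, and `aggfunc='sum'` with `fill_value=0` is used here in place of `np.sum`.

```python
import pandas as pd

# Hypothetical raw records: column names follow the question, values are invented.
raw = pd.DataFrame({
    'koridor_procent': ['0-5', '0-5', '6-10', '6-10'],
    'yur_fiz': ['FL', 'YUL', 'FL', 'YUL'],
    'srok': ['3-Dolg', '3-Dolg', '1-Kpatk', '3-Dolg'],
    'vsego_zadoljennost': [100, 200, 30, 70],
})

# margins=True adds an 'All' row and an 'All' column holding the totals.
demo = pd.pivot_table(raw, values='vsego_zadoljennost',
                      index=['koridor_procent'],
                      columns=['yur_fiz', 'srok'],
                      aggfunc='sum', fill_value=0, margins=True)
print(demo)
```

The result has the same shape as the surrogate table: a two-level column index plus an 'All' row and column.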
Your task can be performed in the following steps:
To get the proper row order, create a CategoricalIndex and reindex the table with it:
cats = ['0-5', '6-10', '10-15', '16-20', '20 i bolee', 'All']
table = table.reindex(pd.CategoricalIndex(cats, categories=cats).rename(table.index.name))
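The reason for this step is that the row labels are strings, so a plain sort puts '10-15' before '6-10'. The sketch below shows the same reindexing idea on a toy frame; `toy`, `order`, and `ordered` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Toy frame whose labels are in string-sorted order ('10-15' comes before '6-10').
toy = pd.DataFrame({'value': [1, 2, 3]}, index=['0-5', '10-15', '6-10'])
toy.index.name = 'koridor_procent'

# The desired row order, spelled out explicitly.
order = ['0-5', '6-10', '10-15']
ordered = toy.reindex(
    pd.CategoricalIndex(order, categories=order, name=toy.index.name))
print(ordered)
```

Because the categories are listed in the desired order, a later sort on this index should follow that order rather than alphabetical order.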
Reorder the columns by the top level of the column index:
table = table.sort_index(axis=1, level=1, ascending=False)
table = table.reindex(['YUL', 'FL', 'All'], level=0, axis=1)
Generate the percentage columns, dividing each column by its 'All' total (the last row):
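pctCols = []
# DataFrame.iteritems was removed in pandas 2.0; items() is the replacement
for colName, col in table.items():
    if colName[0] != 'All':
        # Share of the column total (last row), formatted as text such as '49.5%'
        pctCol = (col / col.iloc[-1] * 100).round(1).astype('str') + '%'
        pctCol.name = (colName[0], 'dola')
        pctCols.append(pctCol)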
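As an aside, the same per-column arithmetic can be done without a loop. This is a sketch on a toy frame, not the answer's method; `toy`, `pct`, and `labels` are illustrative names. It assumes the totals row is labelled 'All'.

```python
import pandas as pd

toy = pd.DataFrame({'a': [1, 3, 4], 'b': [10, 30, 40]},
                   index=['x', 'y', 'All'])

# Divide every column by its own 'All' total, then scale to percent.
pct = toy.div(toy.loc['All'], axis=1).mul(100).round(1)

# Convert to display strings only at the end, so pct stays numeric.
labels = pct.astype(str) + '%'
print(labels)
```

Keeping `pct` numeric and building the strings last makes it possible to sort or filter on the percentages before formatting them.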
Insert the percentage columns, each right after the column it describes:
pos = 1
for col in pctCols:
    table.insert(pos, column=col.name, value=col, allow_duplicates=True)
    pos += 2
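The position advances by 2 because every insertion shifts the remaining columns one place to the right. A small sketch on a toy frame shows the interleaving; `toy` and `extras` are illustrative names.

```python
import pandas as pd

toy = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
extras = [pd.Series(['x', 'y'], name='a_pct'),
          pd.Series(['u', 'v'], name='b_pct')]

pos = 1
for extra in extras:
    # Insert right after the source column; the next source column is now at pos + 1.
    toy.insert(pos, column=extra.name, value=extra)
    pos += 2

print(list(toy.columns))
```

In the answer, `allow_duplicates=True` is needed as well, since 'dola' appears twice under the same top-level name.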
For the above test data, I got the following result:
yur_fiz              YUL                                FL                                All
srok              3-Dolg    dola  1-Kpatk    dola   3-Dolg    dola  1-Kpatk    dola
koridor_procent
0-5              3421599   49.5%        0    0.0%   469532   13.8%        0    0.0%   3891131
6-10              486425    7.0%    16479    5.2%  1688537   49.6%      309    0.9%   2191750
10-15            1084686   15.7%     3394    1.1%   342485   10.1%        2    0.0%   1430567
16-20            1578722   22.8%   131095   41.2%   419492   12.3%      349    1.0%   2129658
20 i bolee        343972    5.0%   167077   52.5%   482238   14.2%    33941   98.1%   1027228
All              6915404  100.0%   318045  100.0%  3402284  100.0%    34601  100.0%  10670334