带列的数据透视表 pandas 的百分比计算 [英] Percentage calculation in pivot table pandas with columns
问题描述
我有一个数据集,其中包含来自不同供应商,位置,日期和产品的多个销售记录. 数据集是这样的:
I have a dataset containing several sells register from different vendors, locations, dates, and products. The data set is like this:
local categoria fabricante tipo consistencia peso pacote ordem vendas_kg
AREA I SABAO ASATP DILUIDO LIQUIDO 1501 A 2000g PLASTICO 1 10
AREA I SABAO TEPOS DILUIDO LIQUIDO 1501 A 2000g PLASTICO 1 20
AREA I SABAO ASATP CAPSULA LIQUIDO 1501 A 2000g PLASTICO 1 20
AREA I SABAO TEPOS CAPSULA LIQUIDO 1501 A 2000g PLASTICO 1 30
AREA I SABAO ASATP DILUIDO LIQUIDO 1501 A 2000g PLASTICO 2 20
AREA I SABAO TEPOS DILUIDO LIQUIDO 1501 A 2000g PLASTICO 2 30
AREA I SABAO ASATP CAPSULA LIQUIDO 1501 A 2000g PLASTICO 2 20
AREA I SABAO TEPOS CAPSULA LIQUIDO 1501 A 2000g PLASTICO 2 30
AREA II SABAO ASATP DILUIDO LIQUIDO 1501 A 2000g PLASTICO 1 10
AREA II SABAO TEPOS DILUIDO LIQUIDO 1501 A 2000g PLASTICO 1 15
AREA II SABAO ASATP CAPSULA LIQUIDO 1501 A 2000g PLASTICO 1 25
AREA II SABAO TEPOS CAPSULA LIQUIDO 1501 A 2000g PLASTICO 1 35
AREA II SABAO ASATP DILUIDO LIQUIDO 1501 A 2000g PLASTICO 2 20
AREA II SABAO TEPOS DILUIDO LIQUIDO 1501 A 2000g PLASTICO 2 25
AREA II SABAO TEPOS CAPSULA LIQUIDO 1501 A 2000g PLASTICO 2 20
AREA II SABAO TEPOS CAPSULA LIQUIDO 1501 A 2000g PLASTICO 2 30
我将使用以下代码来透视此数据集:
I'm pivoting this data set usig the following code:
temp_df = pd.pivot_table(df,index=['local','tipo','ordem'], values=['vendas_kg'] , aggfunc=[np.sum], columns=['fabricante'], fill_values=0, margins=True, margins_name= 'Total')
我得到以下输出:
sum sum
vendas_kg vendas_kg
fabricante ASATP TEPOS Total
local tipo ordem
AREA I DILUIDO 1 10 20 30
2 20 30 50
CAPSULA 1 10 20 30
2 20 30 50
AREA II DILUIDO 1 10 15 25
2 20 25 45
CAPSULA 1 25 35 55
2 20 30 50
我想计算每个['ordem']和每个段的百分比.g. ordem,tipo和本地,例如:
I want to calculate the percentage for each ['ordem'] and for the percentage for each segment.g. ordem,tipo and local, like this:
sum sum
vendas_kg vendas_kg
fabricante ASATP TEPOS % segment Total
local tipo ordem
AREA I DILUIDO 1 33% 66% 50% 30
2 40% 60% 50% 50
CAPSULA 1 33% 66% 50% 30
2 40% 60% 50% 50
AREA II DILUIDO 1 40% 60% 31.25% 25
2 44.44% 55.56% 47.37% 45
CAPSULA 1 43.64% 57.36% 53.63% 55
2 40% 60% 53.63% 50
因此,AREA I DILUIDO 1的总销售额为30,ASTAP销售额占33%,TEPOS 66%,从AREA I 1的总销售额中,DILUIDOs销售额占50%,依此类推.
So the total sales for AREA I DILUIDO 1 is 30, ASATP sales represent 33% of it and TEPOS 66%, and from the total sales of AREA I 1 the DILUIDOs sales represent 50% and so on.
我还想比较['ordem']之间的销售差异,例如段和['fabricante']的百分比增长,并存储在新表中,如下所示:
I also want to compare the sales difference between ['ordem'] e.g. percentage growth of segment and ['fabricante'] and store in a new table like this:
% change in % change in
vendas_kg vendas_kg % change in % change in
fabricante ASATP TEPOS % segment Total
AREA I DILUIDO 1 0 0 0 0
2 +7% -6% 0 20
3 0 0 0 0
AREA I CAPSULA 1 0 0 0 0
2 +7% -6% 0 20
3 0 0 0 0
AREA II DILUIDO 1 0 0 0 0
2 +4.44% -4.44% +16.12% 20
3 0 0 0 0
AREA II CAPSULA 1 0 0 0 0
2 -3.64% +3.64% 0 5
3 0 0 0 0
在过去的5天中,我一直处于这种状态,在['fabricante'] ['tipo']和['local']中有更多类别,因此每个类别都必须适用于两个以上类别. 预先感谢您的帮助,如有疑问,请随时与我联系.
I'm stuck at this over the last 5 days, I have way more categories in ['fabricante']['tipo'] and ['local'] so it must work for more than two categories in each. Thanks for the help in advance, feel free to contact me in case of doubts.
推荐答案
获取百分比:
df_percent = temp_df.iloc[:, [0,1]].apply(lambda x: round(x / x.sum() * 100, 2), axis = 1)
要获取变化,请使用diff
df_diff_percent = df_percent.groupby(level=[0,1]).diff().fillna(0)
sum
vendas_kg
fabricante ASATP TEPOS
local tipo ordem
AREA I CAPSULA 1 0.00 0.00
2 0.00 0.00
DILUIDO 1 0.00 0.00
2 6.67 -6.67
AREA II CAPSULA 1 0.00 0.00
2 -41.67 41.67
这篇关于带列的数据透视表 pandas 的百分比计算的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!