Pandas percentage of total with groupby
Question
This is obviously simple, but as a numpy newbie I'm getting stuck.
I have a CSV file that contains 3 columns: the State, the Office ID, and the Sales for that office.
I want to calculate each office's percentage of the sales in its state (so that the percentages within each state total 100%).
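import numpy as np
import pandas as pd

# list(...) is needed on Python 3, where a range cannot be multiplied directly
df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                   'office_id': list(range(1, 7)) * 2,
                   'sales': [np.random.randint(100000, 999999)
                             for _ in range(12)]})
df.groupby(['state', 'office_id']).agg({'sales': 'sum'})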
This returns:
                  sales
state office_id
AZ    2          839507
      4          373917
      6          347225
CA    1          798585
      3          890850
      5          454423
CO    1          819975
      3          202969
      5          614011
WA    2          163942
      4          369858
      6          959285
I can't seem to figure out how to "reach up" to the state level of the groupby to total up the sales for the entire state to calculate the fraction.
Recommended answer
Paul H's answer is right that you will have to make a second groupby object, but you can calculate the percentage in a simpler way: just group state_office by its state level and divide the sales column by its sum within each state. Copying the beginning of Paul H's answer:
# From Paul H
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                   'office_id': list(range(1, 7)) * 2,
                   'sales': [np.random.randint(100000, 999999)
                             for _ in range(12)]})
state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})
# Change: group state_office by state (index level 0) and divide by each state's sum.
# group_keys=False keeps the original (state, office_id) index on recent pandas versions.
state_pcts = state_office.groupby(level=0, group_keys=False).apply(
    lambda x: 100 * x / x.sum())
Returns:
                     sales
state office_id
AZ    2          16.981365
      4          19.250033
      6          63.768601
CA    1          19.331879
      3          33.858747
      5          46.809373
CO    1          36.851857
      3          19.874290
      5          43.273852
WA    2          34.707233
      4          35.511259
      6          29.781508
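As a possible alternative to apply, the same division can be sketched with transform('sum'), which broadcasts each state's total back onto that state's rows. This is not part of the original answer; the small hand-picked sales figures and the pct column name are assumptions chosen so the arithmetic is easy to check by eye.

```python
import pandas as pd

# Hypothetical, hand-picked figures (not the random data used above).
df = pd.DataFrame({
    'state': ['CA', 'CA', 'WA', 'WA'],
    'office_id': [1, 3, 2, 4],
    'sales': [100, 300, 50, 150],
})

state_office = df.groupby(['state', 'office_id']).agg({'sales': 'sum'})

# transform('sum') returns a Series aligned with state_office's index,
# holding the state total on every row of that state.
state_totals = state_office.groupby(level='state')['sales'].transform('sum')

# Element-wise division; 'pct' is an illustrative column name.
state_office['pct'] = 100 * state_office['sales'] / state_totals

print(state_office)
```

Because the result keeps the (state, office_id) index, summing pct per state should give 100 for every state, which is a quick sanity check on any variant of this calculation.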