pandas: create single size & sum columns after group by multiple columns
Problem description
I have a DataFrame that I group by 3 columns, aggregating the sum and size of the numerical columns. After running the code
df = df.groupby(['year', 'cntry', 'state']).agg(['size', 'sum'])
I get something like the following:
Now I want to split the size sub-columns out of the main columns and create a single size column, while keeping the sum columns under their main column headings. I have tried different approaches without success. These are the methods I have tried, but I could not get any of them to work:
I would be grateful if anyone can help me with this one.
Recommended answer
Setup
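import numpy as np
import pandas as pd

# State names taken from the output shown further down
states = ['California', 'Florida', 'Massachusetts', 'Minnesota']

# Random sample data, so the exact counts differ from run to run
d1 = pd.DataFrame(dict(
    year=np.random.choice((2014, 2015, 2016), 100),
    cntry=['United States' for _ in range(100)],
    State=np.random.choice(states, 100),
    Col1=np.random.randint(0, 20, 100),
    Col2=np.random.randint(0, 20, 100),
    Col3=np.random.randint(0, 20, 100),
))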
df = d1.groupby(['year', 'cntry', 'State']).agg(['size', 'sum'])
df
Answer
The easiest way would have been to run only size after groupby
d1.groupby(['year', 'cntry', 'State']).size()
year  cntry          State
2014  United States  California       10
                     Florida           9
                     Massachusetts     8
                     Minnesota         5
2015  United States  California        9
                     Florida           7
                     Massachusetts     4
                     Minnesota        11
2016  United States  California        8
                     Florida           8
                     Massachusetts    11
                     Minnesota        10
dtype: int64
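As a minimal sketch of what size returns after groupby, the snippet below uses a small hand-written frame (`toy` and its values are hypothetical, not the random `d1` from the setup). It shows that the result is an unnamed Series indexed by the group keys, which can be renamed so it later becomes a column called size.

```python
import pandas as pd

# Hypothetical toy data standing in for d1
toy = pd.DataFrame({
    'year': [2014, 2014, 2015],
    'State': ['California', 'California', 'Florida'],
    'Col1': [1, 2, 3],
})

# size() on the groupby returns one row count per group as a Series
counts = toy.groupby(['year', 'State']).size()

# The Series has no name yet; rename it so it can serve as a 'size' column
counts = counts.rename('size')
print(counts.to_frame())
```

The renamed Series keeps the group keys as its index, so it lines up with any other aggregation computed on the same groupby.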
To use the calculated df
df.xs('size', axis=1, level=1)
That would be useful if the size were different for each column. But size counts the rows in each group, so the size column is the same for ['Col1', 'Col2', 'Col3'], and we can just do
df[('Col1', 'size')]
year  cntry          State
2014  United States  California       10
                     Florida           9
                     Massachusetts     8
                     Minnesota         5
2015  United States  California        9
                     Florida           7
                     Massachusetts     4
                     Minnesota        11
2016  United States  California        8
                     Florida           8
                     Massachusetts    11
                     Minnesota        10
Name: (Col1, size), dtype: int64
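To illustrate why any one size column can stand in for the others, here is a small sketch on hypothetical data (`toy` is made up for the example, with one missing value placed in Col2 on purpose). It compares size with count, which is an aggregation that can differ between columns because it skips NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value in Col2
toy = pd.DataFrame({
    'State': ['California', 'California', 'Florida'],
    'Col1': [1, 2, 3],
    'Col2': [4.0, np.nan, 6.0],
})
agg = toy.groupby('State').agg(['size', 'count'])

# xs on the second column level yields one size column per original column
sizes = agg.xs('size', axis=1, level=1)

# size counts rows per group, NaN included, so the columns match
same_size = sizes['Col1'].equals(sizes['Col2'])

# count skips NaN, so it can differ from column to column
col1_count = agg[('Col1', 'count')].tolist()
col2_count = agg[('Col2', 'count')].tolist()
print(same_size, col1_count, col2_count)
```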
Combined View 1
pd.concat([df[('Col1', 'size')].rename('size'),
df.xs('sum', axis=1, level=1)], axis=1)
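A similar flat layout can probably also be reached without building the size/sum MultiIndex first. The sketch below is an alternative, not the approach used in this answer: it computes the sums and the group sizes separately from one groupby object and inserts the sizes as the first column. The frame `toy` and the names `grouped` and `out` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data standing in for d1
toy = pd.DataFrame({
    'year': [2014, 2014, 2015],
    'State': ['California', 'California', 'Florida'],
    'Col1': [1, 2, 3],
    'Col2': [4, 5, 6],
})

grouped = toy.groupby(['year', 'State'])

# Sum every remaining column per group
out = grouped.sum()

# Put the per-group row count in front; it aligns on the group index
out.insert(0, 'size', grouped.size())
print(out)
```

This avoids computing size once per column, at the cost of a second pass over the groups.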
Combined View 2
pd.concat([df[('Col1', 'size')].rename(('', 'size')),
df.xs('sum', axis=1, level=1, drop_level=False)], axis=1)