Specifying column order following groupby aggregation


Question

The order of my age, height and weight columns changes with each run of the code. I need the order of the aggregated columns to stay fixed, because I later refer to this output file by column position. How can I make sure age, height and weight are written in the same order every time?

import numpy as np
import pandas as pd

# input_file and output_file are paths defined elsewhere
d = pd.read_csv(input_file, na_values=[''])
df = pd.DataFrame(d)
df.index_col = ['name', 'address']

df_out = df.groupby(df.index_col).agg({'age':np.mean, 'height':np.sum, 'weight':np.sum})
df_out.to_csv(output_file, sep=',')

Recommended answer

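I think you can select the columns as a subset after aggregating, which fixes their order. (The order of the keys in the dict passed to agg is probably not preserved in your environment; plain dicts were unordered before Python 3.7.)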

# the outer parentheses let the method chain span several lines
df_out = (df.groupby(df.index_col)
            .agg({'age':np.mean, 'height':np.sum, 'weight':np.sum})[['age','height','weight']])

You can also pass the names of the pandas aggregation functions as strings:

df_out = (df.groupby(df.index_col)
            .agg({'age':'mean', 'height':'sum', 'weight':'sum'})[['age','height','weight']])
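As an alternative that is not part of the original answer, newer pandas versions (0.25 and later) support named aggregation, where the output columns follow the order of the keyword arguments. A minimal sketch, assuming the same small sample data as in this answer and a hypothetical result name df_named:

```python
import pandas as pd

# Sample data mirroring the answer's example; not the asker's real input file.
df = pd.DataFrame({'name': ['q', 'q', 'a', 'a'],
                   'address': ['a', 'a', 's', 's'],
                   'age': [7, 8, 9, 10],
                   'height': [1, 3, 5, 7],
                   'weight': [5, 3, 6, 8]})

# Named aggregation: each keyword is an output column,
# given as (source column, aggregation function name).
df_named = df.groupby(['name', 'address']).agg(
    age=('age', 'mean'),
    height=('height', 'sum'),
    weight=('weight', 'sum'),
)

print(list(df_named.columns))
```

With this form no trailing column subset is needed, but it only works if your pandas version is recent enough.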

Sample:

df = pd.DataFrame({'name':['q','q','a','a'],
                   'address':['a','a','s','s'],
                   'age':[7,8,9,10],
                   'height':[1,3,5,7],
                   'weight':[5,3,6,8]})

print (df)
  address  age  height name  weight
0       a    7       1    q       5
1       a    8       3    q       3
2       s    9       5    a       6
3       s   10       7    a       8

df.index_col = ['name', 'address']
df_out = (df.groupby(df.index_col)
            .agg({'age':'mean', 'height':'sum', 'weight':'sum'})[['age','height','weight']])

print (df_out)
              age  height  weight
name address                     
a    s        9.5      12      14
q    a        7.5       4       8
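To check that the subset really makes the result independent of how the agg dict is ordered, you could compare two runs whose dicts list the keys differently. This is only a sketch; the names wanted, agg_a, agg_b, out_a and out_b are made up for illustration:

```python
import pandas as pd

# Sample data mirroring the example above.
df = pd.DataFrame({'name': ['q', 'q', 'a', 'a'],
                   'address': ['a', 'a', 's', 's'],
                   'age': [7, 8, 9, 10],
                   'height': [1, 3, 5, 7],
                   'weight': [5, 3, 6, 8]})

wanted = ['age', 'height', 'weight']  # the column order the output must have

# Same aggregations, keys listed in a different order.
agg_a = {'age': 'mean', 'height': 'sum', 'weight': 'sum'}
agg_b = {'weight': 'sum', 'age': 'mean', 'height': 'sum'}

# The trailing [wanted] subset pins the column order in both cases.
out_a = df.groupby(['name', 'address']).agg(agg_a)[wanted]
out_b = df.groupby(['name', 'address']).agg(agg_b)[wanted]

print(list(out_a.columns))
print(list(out_b.columns))
```

Both results should list age, height, weight in that order, whatever order the dict keys were written in.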

EDIT by suggestion: add reset_index() if you also need the index values as regular columns; as_index=False does not work here in that case:

df_out = (df.groupby(df.index_col)
            .agg({'age':'mean', 'height':'sum', 'weight':'sum'})[['age','height','weight']]
            .reset_index())

print (df_out)
  name address  age  height  weight
0    a       s  9.5      12      14
1    q       a  7.5       4       8
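Since the output goes to a CSV file that is later read by column position, another option (not from the original answer) is to pin the order at write time with the columns argument of to_csv. A minimal sketch, assuming the same sample data; an in-memory io.StringIO buffer stands in for output_file, and group_cols and value_cols are hypothetical names:

```python
import io
import pandas as pd

# Sample data mirroring the example above.
df = pd.DataFrame({'name': ['q', 'q', 'a', 'a'],
                   'address': ['a', 'a', 's', 's'],
                   'age': [7, 8, 9, 10],
                   'height': [1, 3, 5, 7],
                   'weight': [5, 3, 6, 8]})

group_cols = ['name', 'address']
value_cols = ['age', 'height', 'weight']

df_out = (df.groupby(group_cols)
            .agg({'age': 'mean', 'height': 'sum', 'weight': 'sum'})
            .reset_index())

# An in-memory buffer stands in for output_file in this sketch.
buffer = io.StringIO()
# columns= fixes the order in which the columns are written to the file.
df_out.to_csv(buffer, sep=',', columns=group_cols + value_cols, index=False)

lines = buffer.getvalue().splitlines()
print(lines[0])
```

The first line written is the header, so checking it is a cheap way to confirm the column positions before anything downstream relies on them.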
