Calculating mean of a specific column by specific rows


Question

I have a dataframe that looks like the one in the pictures.

Now I want to add a new column that shows the average power for each day (the data is sampled every 5 minutes), computed separately for day and night according to day_or_night (day = 0, night = 1). I've gotten this far:

train['avg_by_day'][train['day_or_night']==1] = train['power'][train['day_or_night']==1].mean()
train['avg_by_day'][train['day_or_night']==0] = train['power'][train['day_or_night']==0].mean()

But this only assigns the average of all power values that fall in daytime (or, likewise, nighttime) across the whole dataset, which isn't what I'm after: I want a separate average for each individual day and each individual night.
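To make the problem concrete, here is a minimal sketch on made-up data (the values and the two-day layout are assumptions, not the asker's real dataframe). It applies the same mask-based idea, written with .loc, and shows that every daytime row receives one shared value regardless of which day it belongs to:

```python
import pandas as pd

# Hypothetical toy data: two days, one daytime (0) and one nighttime (1) sample each.
train = pd.DataFrame({
    "day":          [1, 1, 2, 2],
    "day_or_night": [0, 1, 0, 1],
    "power":        [10.0, 2.0, 30.0, 6.0],
})

# Start with an empty column, then fill it one day_or_night flag at a time.
train["avg_by_day"] = float("nan")
for flag in (0, 1):
    mask = train["day_or_night"] == flag
    # The mean is taken over ALL rows with this flag, so days are mixed together.
    train.loc[mask, "avg_by_day"] = train.loc[mask, "power"].mean()

print(train)
```

Both daytime rows end up with the mean of 10 and 30, and both nighttime rows with the mean of 2 and 6, so the per-day distinction is lost.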

I need something like: train['avg_by_day'] == train.power.mean() when day == 1 and day_or_night == 1, and the same for each day.

Recommended answer

So you want to group the dataframe by day and day_or_night and create a new column with mean power values for each group:

train['avg_by_day'] = train.groupby(['day','day_or_night'])['power']\
                           .transform('mean')
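As an illustration, the snippet above can be tried on a small made-up dataframe (the numbers are assumptions chosen so the group means are easy to check by hand). transform('mean') returns a result aligned with the original rows, so each row receives the mean of its own (day, day_or_night) group:

```python
import pandas as pd

# Hypothetical toy data: two days, each with two daytime (0) and two nighttime (1) samples.
train = pd.DataFrame({
    "day":          [1, 1, 1, 1, 2, 2, 2, 2],
    "day_or_night": [0, 0, 1, 1, 0, 0, 1, 1],
    "power":        [10.0, 20.0, 1.0, 3.0, 30.0, 50.0, 5.0, 7.0],
})

# One mean per (day, day_or_night) group, broadcast back to every row of that group.
train["avg_by_day"] = (
    train.groupby(["day", "day_or_night"])["power"].transform("mean")
)

print(train)
```

Unlike an aggregation with .mean() directly on the groupby, which would return one row per group, transform keeps the original row count, which is why the result can be assigned straight to a new column.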

You should probably also include year and month in the grouping columns; otherwise the 1st day of every month will be grouped together, as will the 2nd day, and so on.
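A short sketch of that difference, assuming the dataframe has columns named year and month (the column names follow the answer's wording; the data is invented). Here January 1st and February 1st share the same day value, so grouping by day alone merges them:

```python
import pandas as pd

# Hypothetical toy data: the 1st of January and the 1st of February, daytime only.
train = pd.DataFrame({
    "year":         [2020, 2020, 2020, 2020],
    "month":        [1, 1, 2, 2],
    "day":          [1, 1, 1, 1],
    "day_or_night": [0, 0, 0, 0],
    "power":        [10.0, 20.0, 30.0, 50.0],
})

# Grouping by day only: both months collapse into one group.
train["avg_day_only"] = (
    train.groupby(["day", "day_or_night"])["power"].transform("mean")
)

# Grouping by the full calendar date keeps each month's 1st day separate.
keys = ["year", "month", "day", "day_or_night"]
train["avg_by_day"] = train.groupby(keys)["power"].transform("mean")

print(train)
```

With only day in the key, all four rows get the same mean; with year and month included, the January and February rows get their own averages.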
