Replace column values within a groupby and condition

Problem description

I have a dataframe in which I want to find the minimum value of a column within each group and then, based on that row, update the values of some of the other columns.

The following code does what I want:

import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 2],
                   'Albedo': [0.2, 0.4, 0.5, 0.3, 0.5, 0.1],
                   'Temp': [20, 30, 15, 40, 10, 5],
                   'Precip': [200, 100, 150, 60, 110, 45],
                   'Year': [1950, 2000, 2004, 1999, 1976, 1916]})

# columns whose values should be replaced
cols = ['Temp', 'Precip', 'Year']

pieces = []

for key, grp in df.groupby('ID'):
    # work on a copy so the assignment below does not trigger SettingWithCopyWarning
    grp = grp.copy()

    # row(s) holding the minimum year in this group
    replace = grp.loc[grp['Year'] == grp['Year'].min()]

    # overwrite each column with the value from that row
    for col in cols:
        grp[col] = replace[col].unique()[0]

    # collect the updated group
    pieces.append(grp)

# DataFrame.append was removed in pandas 2.0, so concatenate the pieces instead
final = pd.concat(pieces)
print(final)

This produces:

   Albedo  ID  Precip  Temp  Year
0     0.2   1     200    20  1950
1     0.4   1     200    20  1950
2     0.5   1     200    20  1950
3     0.3   2      45     5  1916
4     0.5   2      45     5  1916
5     0.1   2      45     5  1916

So within each group of ID, I find the row with the minimum Year and then copy its Temp, Precip and Year values to the other rows in the group. This seems like a lot of looping, and I am wondering whether there is a better way.

Recommended answer

Use groupby on ID, then transform with idxmin on Year, to get a series of index labels: one per row, each pointing at the row with the minimum Year in that row's group. Pass these labels to loc to get your result.

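# idxmin returns index labels, so select with loc rather than iloc
(df.loc[df.groupby('ID')['Year'].transform('idxmin')]
   .reset_index(drop=True)
   # restore the original per-row Albedo (assumes df has a default RangeIndex)
   .assign(Albedo=df['Albedo']))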

   Albedo  ID  Precip  Temp  Year
0     0.2   1     200    20  1950
1     0.4   1     200    20  1950
2     0.5   1     200    20  1950
3     0.3   2      45     5  1916
4     0.5   2      45     5  1916
5     0.1   2      45     5  1916
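
The same idea can also be written so that only the listed columns are overwritten, which avoids the reset_index and assign step. The following is a minimal sketch rather than part of the original answer; the names idx and result are introduced here for illustration, and it assumes the replaced columns share a dtype (with mixed dtypes, to_numpy would produce an object array).

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 1, 1, 2, 2, 2],
                   'Albedo': [0.2, 0.4, 0.5, 0.3, 0.5, 0.1],
                   'Temp': [20, 30, 15, 40, 10, 5],
                   'Precip': [200, 100, 150, 60, 110, 45],
                   'Year': [1950, 2000, 2004, 1999, 1976, 1916]})

# columns whose values should be replaced
cols = ['Temp', 'Precip', 'Year']

# for every row, the index label of the minimum-Year row in its ID group
idx = df.groupby('ID')['Year'].transform('idxmin')

# copy the frame, then overwrite only the chosen columns;
# to_numpy() drops the index so values are assigned by position
result = df.copy()
result[cols] = df.loc[idx, cols].to_numpy()

print(result)
```

Because only cols is touched, the other columns (here ID and Albedo) keep their original values and the original index is preserved.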
