Determine the maximum value of one column per group of another column in a pandas DataFrame


Problem description

I have a dataframe that contains a location ID, a store name, and the store's revenue. I want to determine which store has the maximum revenue in each location.

I wrote the code below for this, but I am not sure whether there is a better way to handle this case.

import pandas as pd

dframe = pd.DataFrame({
    "Loc_Id": [1, 2, 2, 1, 2, 1, 3, 3],
    "Store": ["A", "B", "C", "B", "D", "B", "A", "C"],
    "Revenue": [50, 70, 45, 35, 80, 70, 90, 65],
})

# Group by location ID, then store each location's maximum revenue in a new column
dframe["max_value"] = dframe.groupby("Loc_Id")["Revenue"].transform("max")

# Flag rows whose revenue equals the maximum revenue of their location
dframe["is_loc_max"] = dframe.apply(lambda x: 1 if x["Revenue"] == x["max_value"] else 0, axis=1)

# Drop the intermediate column
dframe.drop(columns=["max_value"], inplace=True)

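The required output is the dataframe with an extra is_loc_max column, set to 1 on the rows that hold the maximum revenue for their location and 0 elsewhere.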

Is there a better way to get this output?

Recommended answer

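Compute the per-location maximum with groupby and transform, create a boolean mask by comparing it with Revenue using eq (==), and convert the mask to integers so that False becomes 0 and True becomes 1: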

s = dframe.groupby("Loc_Id")["Revenue"].transform("max")
dframe["max_value"] = s.eq(dframe["Revenue"]).astype(int)
print(dframe)
   Loc_Id Store  Revenue  max_value
0       1     A       50          0
1       2     B       70          0
2       2     C       45          0
3       1     B       35          0
4       2     D       80          1
5       1     B       70          1
6       3     A       90          1
7       3     C       65          0
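
If only the winning store per location is needed, rather than a flag on every row, one possible alternative is to select the rows with idxmax. This is a minimal sketch and not part of the original answer; the names top_idx and top_stores are made up for illustration. It assumes the default integer index, and it keeps only the first row when several stores tie for the maximum.

```python
import pandas as pd

dframe = pd.DataFrame({
    "Loc_Id": [1, 2, 2, 1, 2, 1, 3, 3],
    "Store": ["A", "B", "C", "B", "D", "B", "A", "C"],
    "Revenue": [50, 70, 45, 35, 80, 70, 90, 65],
})

# For each Loc_Id, idxmax returns the index label of the first row
# that holds the maximum Revenue (top_idx is a hypothetical name).
top_idx = dframe.groupby("Loc_Id")["Revenue"].idxmax()

# Select those rows to get one top store per location.
top_stores = dframe.loc[top_idx].reset_index(drop=True)
print(top_stores)
```

With the sample data this should select store B for location 1, D for location 2, and A for location 3. The mask approach above behaves differently on ties: it flags every row that equals the maximum, while idxmax keeps just one.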
