Python Pandas groupby, rank, then assign value based on custom rank


Problem description

Problem setup

Pandas DataFrame

df = pd.DataFrame({'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'], 'Subgroup': ['Group 1', 'Group 1', 'Group 1', 'Group 1', 'Group 1', 'Group 1', 'Group 2', 'Group 2', 'Group 2'], 'Keyword': ['kw 1', 'kw 1', 'kw 1', 'kw 2', '+kw +2', 'kw 2', 'kw 3', 'kw 3', 'kw 3'], 'Normalized': ['kw 1', 'kw 1', 'kw 1', 'kw 2', 'kw 2', 'kw 2', 'kw 3', 'kw 3', 'kw 3'], 'Criterion Type': ['Exact', 'Phrase', 'Broad', 'Phrase', 'Broadified', 'Exact', 'Broad', 'Exact', 'Phrase'], 'Max CPC': [1.62, 1.73, 0.87, 1.70, 0.85, 1.60, 0.99, 1.58, 1.68], 'CPC Rank': [2, 1, 3, 1, 3, 2, 3, 2, 1], 'Type Rank': [1, 2, 3, 2, 3, 1, 3, 1, 2]})

This puts the columns in the right order:

df = df[['Group', 'Subgroup', 'Keyword', 'Normalized', 'Criterion Type', 'Max CPC', 'CPC Rank', 'Type Rank']]

Goal

Group by ['Group', 'Subgroup', 'Normalized'], then rank the Max CPC values within each group. Next, I want to map the Max CPC associated with each CPC Rank to the row with the matching Type Rank, which is determined by Criterion Type and my own custom rank: {'Exact': 1, 'Phrase': 2, 'Broadified': 3, 'Broad': 4}
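The sample data already contains the CPC Rank and Type Rank columns, so the question does not show how they were produced. As a rough sketch only, they could be derived with groupby and rank along these lines. The helper column name 'Type Priority' and the choice of method='first' for tie-breaking are assumptions made for illustration, not part of the original post; only the first keyword group is used to keep the example short.

```python
import pandas as pd

# One keyword group taken from the question's sample data.
df = pd.DataFrame({
    'Group': ['A', 'A', 'A'],
    'Subgroup': ['Group 1', 'Group 1', 'Group 1'],
    'Normalized': ['kw 1', 'kw 1', 'kw 1'],
    'Criterion Type': ['Exact', 'Phrase', 'Broad'],
    'Max CPC': [1.62, 1.73, 0.87],
})

keys = ['Group', 'Subgroup', 'Normalized']
custom_rank = {'Exact': 1, 'Phrase': 2, 'Broadified': 3, 'Broad': 4}

# Rank Max CPC within each group; the highest value gets rank 1.
df['CPC Rank'] = (
    df.groupby(keys)['Max CPC']
      .rank(method='first', ascending=False)
      .astype(int)
)

# Map each Criterion Type to its custom priority (hypothetical helper column),
# then rank those priorities within the group to get a 1..n ordering.
df['Type Priority'] = df['Criterion Type'].map(custom_rank)
df['Type Rank'] = (
    df.groupby(keys)['Type Priority']
      .rank(method='first')
      .astype(int)
)

print(df[['Criterion Type', 'Max CPC', 'CPC Rank', 'Type Rank']])
```

Ranking the mapped priorities within each group is what turns 'Broad' (priority 4) into Type Rank 3 when a group has only three criterion types, which matches the values in the sample data.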

The result would be a New CPC column holding the appropriate Max CPC for each row.

Recommended answer

import pandas as pd
import numpy as np

df = pd.DataFrame({'Group': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'], 'Subgroup': ['Group 1', 'Group 1', 'Group 1', 'Group 1', 'Group 1', 'Group 1', 'Group 2', 'Group 2', 'Group 2'], 'Keyword': ['kw 1', 'kw 1', 'kw 1', 'kw 2', '+kw +2', 'kw 2', 'kw 3', 'kw 3', 'kw 3'], 'Normalized': ['kw 1', 'kw 1', 'kw 1', 'kw 2', 'kw 2', 'kw 2', 'kw 3', 'kw 3', 'kw 3'], 'Criterion Type': ['Exact', 'Phrase', 'Broad', 'Phrase', 'Broadified', 'Exact', 'Broad', 'Exact', 'Phrase'], 'Max CPC': [1.62, 1.73, 0.87, 1.70, 0.85, 1.60, 0.99, 1.58, 1.68], 'CPC Rank': [2, 1, 3, 1, 3, 2, 3, 2, 1], 'Type Rank': [1, 2, 3, 2, 3, 1, 3, 1, 2]})
df = df[['Group', 'Subgroup', 'Keyword', 'Normalized', 'Criterion Type', 'Max CPC', 'CPC Rank', 'Type Rank']]

# Sort by custom priority (Type Rank) within each group.
# DataFrame.sort was removed from pandas; sort_values is its replacement.
df = df.sort_values(['Group', 'Subgroup', 'Normalized', 'Type Rank'])
# Reset the index and drop the old one
df = df.reset_index(drop=True)
print(df)
# Create df1, a Series of the Max CPC column ordered by CPC Rank within each group
df1 = df.sort_values(['Group', 'Subgroup', 'Normalized', 'CPC Rank'])['Max CPC']
# Reset the index and drop the old one so it lines up with df by position
df1 = df1.reset_index(drop=True)
print(df1)

# Add the df1 Series to df as the New CPC column (assignment aligns on the index)
df['New CPC'] = df1

print(df)

This is by far the most efficient solution to this problem. The hard part was realizing that I could sort df by Type Rank so that the Criterion Type rows within each group were ordered by their rank. This meant I wanted the highest Max CPC to apply to the first row, the second-highest Max CPC to the second row, and so on.

Then all I had to do was create a Max CPC Series sorted by CPC Rank.

Lastly, add this Series to the existing df. Because both were sorted by the same group keys and had their indexes reset, the values line up row by row.
