Pandas: Groupby and iterate with conditionals within groups?

Problem description

I have a relatively tricky iteration problem that I am having trouble implementing.

I have a dataframe whose first 6 columns are shown below. I am trying to write a function that iterates within groups (specifically, grouping rows by Category and Level) and then generates a new variable if two conditions are met for that row versus any other row in the group. I'd like to generate the Opportunity? binary indicator below, which equals 1 when the conditions are met. The Reason column just explains the result I want to generate.

Logic: for each id_group, if ((Metric_LHS[entity] > Metric_RHS[any other entity in the group]) & (Metric_LHS[entity] > Baseline[entity])), then Opportunity? = 1.

So in my example, the Opportunity? column equals 1 for Jim because Metric_LHS(Jim) > Metric_RHS(Jack) and Metric_LHS(Jim) > Baseline(Jim). Meanwhile, Rick is a 0, for example, because the condition does not hold against the only other person in his group, Joe.

See below for some pieces of the code and logic that I have written. My question is the following: how do I iterate over each row of each group and compare that row with every other row in that group?

# My attempt so far (this does not run)
id_group = df.groupby(['Category', 'Level'])
for row in id_group:
    df['Opportunity?'](([df[metric_LHS][row] > df[Metric_RHS][row+1]) &\
    (df[metric_LHS][row] > df[Baseline][row])) = 1
# How to iterate to the next row in the group?

Recommended answer

When iterating this way over a groupby object, each item returned is a tuple (index, group): the group key and the sub-DataFrame holding that group's rows.
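As a small illustration of that tuple, the sketch below iterates over a groupby object and records what each step yields. The frame `demo` and the list `seen` are hypothetical names made up for this example; they are not part of the question's data.

```python
import pandas as pd

# Hypothetical three-row frame, only to show what groupby iteration yields.
demo = pd.DataFrame({
    'Category': ['South', 'South', 'North'],
    'Level': [1, 1, 2],
    'Metric_LHS': [100, 80, 70],
})

seen = []
for key, group in demo.groupby(['Category', 'Level']):
    # key is a tuple (Category, Level); group is a DataFrame that keeps
    # the original index labels of its rows.
    seen.append((key, list(group.index)))
    print(key, list(group.index))
```

Because each group keeps the original index labels, those labels can be used later to write results back into the full dataframe.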

To iterate over the rows of each group, you could use DataFrame.iterrows.

Something like this:

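id_group = df.groupby(['Category', 'Level'])

for g_idx, group in id_group:
    # r_idx is the row's index label in the original df
    for r_idx, row in group.iterrows():
        # True if this row's Metric_LHS beats at least one Metric_RHS in the group
        # and also beats the row's own Baseline (a NaN Baseline compares as False)
        if ((row['Metric_LHS'] > group['Metric_RHS']).any()
                and row['Metric_LHS'] > row['Baseline']):
            df.loc[r_idx, 'Opportunity?'] = 1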
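Note that the loop above compares each row's Metric_LHS against every Metric_RHS in the group, including the row's own. If the comparison should cover only the other rows, as the question's logic states, one possible variation is to drop the current row before comparing and to initialise the indicator to 0 so that it is strictly binary. This is a sketch under those assumptions, not the answer's original code; the frame `toy` and its values are hypothetical and chosen so that excluding the row itself changes the outcome.

```python
import pandas as pd

# Hypothetical data: row 0 beats its own Metric_RHS (90) but not row 1's (120).
toy = pd.DataFrame({
    'Category': ['A', 'A', 'B'],
    'Level': [1, 1, 1],
    'Metric_LHS': [100, 95, 100],
    'Metric_RHS': [90, 120, 90],
    'Baseline': [80, 40, 80],
})

toy['Opportunity?'] = 0  # default, so the indicator is 0/1 rather than NaN/1

for key, group in toy.groupby(['Category', 'Level']):
    for r_idx, row in group.iterrows():
        # Metric_RHS of every row in the group except the current one
        others_rhs = group['Metric_RHS'].drop(r_idx)
        beats_another = (row['Metric_LHS'] > others_rhs).any()
        beats_baseline = row['Metric_LHS'] > row['Baseline']
        if beats_another and beats_baseline:
            toy.loc[r_idx, 'Opportunity?'] = 1

print(toy)
```

A group with a single row has no other rows to compare against, so `.any()` on the empty comparison is False and the indicator stays 0.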

Working example using the provided toy data:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Name':['Jim', 'Jack', 'Greg', 'Alex', 'Steve', 'Jack', 'Rick', 'Joe', 'Bill', 'Dave', 'Dan'],
        'Category':['South']*3 + ['North']*3 + ['West']*3 + ['East']*2,
        'Level': [1,1,2,2.5,2.5,2.5,3,3,3.25,4,4],
        'Metric_LHS': [100,80,70,110,90,105,110,111,90,87,83],
        'Metric_RHS': [120,90,75,115,95,110,112,113,95,90,85],
        'Baseline': [95,np.nan,73,112,85,103,105,112,93,75,81],
        'Opportunity?': [np.nan]*11})


id_group = df.groupby(['Category', 'Level'])

for g_idx, group in id_group:
    for r_idx, row in group.iterrows():
        # Both conditions must hold; rows that fail keep Opportunity? as NaN
        if ((row['Metric_LHS'] > group['Metric_RHS']).any()
                and row['Metric_LHS'] > row['Baseline']):
            df.loc[r_idx, 'Opportunity?'] = 1


print(df)

     Name Category  Level  Metric_LHS  Metric_RHS  Baseline  Opportunity?
0     Jim    South   1.00         100         120      95.0           1.0
1    Jack    South   1.00          80          90       NaN           NaN
2    Greg    South   2.00          70          75      73.0           NaN
3    Alex    North   2.50         110         115     112.0           NaN
4   Steve    North   2.50          90          95      85.0           NaN
5    Jack    North   2.50         105         110     103.0           1.0
6    Rick     West   3.00         110         112     105.0           NaN
7     Joe     West   3.00         111         113     112.0           NaN
8    Bill     West   3.25          90          95      93.0           NaN
9    Dave     East   4.00          87          90      75.0           1.0
10    Dan     East   4.00          83          85      81.0           NaN
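For larger dataframes, the same result can probably be reached without explicit loops. The sketch below is an alternative to the iterrows approach, not part of the original answer: "greater than at least one Metric_RHS in the group" is the same as "greater than the group's minimum Metric_RHS", which `groupby(...).transform('min')` broadcasts back to every row. Like the loop, it includes the row's own Metric_RHS in the comparison. The names `group_min_rhs`, `beats_some_rhs`, and `beats_baseline` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Jim', 'Jack', 'Greg', 'Alex', 'Steve', 'Jack',
             'Rick', 'Joe', 'Bill', 'Dave', 'Dan'],
    'Category': ['South']*3 + ['North']*3 + ['West']*3 + ['East']*2,
    'Level': [1, 1, 2, 2.5, 2.5, 2.5, 3, 3, 3.25, 4, 4],
    'Metric_LHS': [100, 80, 70, 110, 90, 105, 110, 111, 90, 87, 83],
    'Metric_RHS': [120, 90, 75, 115, 95, 110, 112, 113, 95, 90, 85],
    'Baseline': [95, np.nan, 73, 112, 85, 103, 105, 112, 93, 75, 81],
})

# Smallest Metric_RHS in each (Category, Level) group, aligned to every row
group_min_rhs = df.groupby(['Category', 'Level'])['Metric_RHS'].transform('min')

beats_some_rhs = df['Metric_LHS'] > group_min_rhs
beats_baseline = df['Metric_LHS'] > df['Baseline']  # NaN Baseline compares as False

# 1 where both conditions hold, 0 otherwise
df['Opportunity?'] = (beats_some_rhs & beats_baseline).astype(int)

print(df[['Name', 'Category', 'Level', 'Opportunity?']])
```

This variant writes 0 instead of NaN for rows that do not qualify, which matches the binary indicator the question asks for.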
