Python Pandas right fill values based on groups


Problem description

I am trying to replicate an Excel-like "right fill" function, which fills a value to the right until the next value that is not null/NaN/empty. This right fill should only be applied where the value in the immediately following row is not empty or "nan". Furthermore, it has to be done for every group. I have the following pandas DataFrame. My current input table is "have"; my desired output table is "want".

I am just a beginner in Python, so any help would be appreciated. For those who would like to perform this as a by-group operation, the data is as follows. Table "have", with grouping field "groups":

import pandas as pd

# Input table: the first row of each group holds sparse labels to be right-filled.
have = pd.DataFrame({
    "groups": pd.Series(["group1","group1","group1","group2","group2","group2"]),
    "0": pd.Series(["abc","1","something here","abc2","1","something here"]),
    "1": pd.Series(["","2","something here","","","something here"]),
    "2": pd.Series(["","3","something here","","3","something here"]),
    "3": pd.Series(["something","1","something here","something","1","something here"]),
    "4": pd.Series(["","2","something here","","2","something here"]),
    "5": pd.Series(["","","something here","","","something here"]),
    "6": pd.Series(["","","something here","","","something here"]),
    "7": pd.Series(["cdf","5","something here","mnop","5","something here"]),
    "8": pd.Series(["","6","something here","","6","something here"]),
    "9": pd.Series(["xyz","1","something here","xyz","1","something here"]),
})

Table "want", with grouping field "groups":

import pandas as pd

# Desired output table, with values exactly as posted in the question.
want = pd.DataFrame({
    "groups": pd.Series(["group1","group1","group1","group2","group2","group2"]),
    "0": pd.Series(["abc","1","something here","anything","1","something here"]),
    "1": pd.Series(["abc","2","something here"," anything ","2","something here"]),
    "2": pd.Series(["abc","3","something here"," anything ","3","something here"]),
    "3": pd.Series(["something","1","something here","","","something here"]),
    "4": pd.Series(["something ","2","something here","","","something here"]),
    "5": pd.Series(["","","something here","","","something here"]),
    "6": pd.Series(["","","something here","","","something here"]),
    "7": pd.Series(["cdf","5","something here","mnop","5","something here"]),
    "8": pd.Series(["cdf ","6","something here"," mnop ","6","something here"]),
    "9": pd.Series(["xyz","1","something here","xyz","1","something here"]),
})

I tried the following code, but I am still trying to familiarize myself with groupby and apply statements:

grouped=have.groupby('groups') 
have.groupby('groups').apply(lambda g: have.loc[g].isnull() )
#cond = have.loc[1].isnull() | have.loc[1].ne('')
want.loc[0, cond] = want.loc[0, cond].str.strip().replace('', None)
want

Recommended answer

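def fill(df):
    df = df.copy()
    # Labels of the group's first two rows: the row to fill and the row below it.
    i0, i1 = df.index[0], df.index[1]
    # Only fill columns whose value in the row below is NaN or non-empty.
    cond = df.loc[i1].isnull() | df.loc[i1].ne('')
    # Blank out empty strings, then forward-fill along the row. Some older
    # pandas versions forward-filled when given .replace('', None); doing it
    # explicitly avoids relying on that behavior.
    row = df.loc[i0, cond].str.strip()
    df.loc[i0, cond] = row.mask(row.eq('')).ffill()
    return df


have.groupby('groups', group_keys=False).apply(fill)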
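
For readers who want to try the idea end to end, here is a minimal self-contained sketch of the same per-group right fill. It is not the answerer's code. The small frame, the helper name `right_fill_first_row` and the `value_cols` list are assumptions made for illustration. It iterates over the groups instead of calling `groupby(...).apply`, and it leaves the `groups` column out of the fill so the group label cannot leak into an empty first value column.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, shaped like "have").
have = pd.DataFrame({
    "groups": ["g1", "g1", "g2", "g2"],
    "0": ["abc", "1", "abc2", "1"],
    "1": ["", "2", "", ""],
    "2": ["", "3", "", "3"],
    "3": ["x", "", "y", "4"],
})

# Every column except the grouping key takes part in the fill.
value_cols = [c for c in have.columns if c != "groups"]


def right_fill_first_row(block):
    """Right-fill the first row of one group, restricted to the columns
    whose value in the second row is not an empty string."""
    block = block.copy()
    if len(block) < 2:
        return block  # no second row to decide where filling is allowed
    first, second = block.index[0], block.index[1]
    below = block.loc[second, value_cols]
    fillable = below[below.ne("")].index  # labels of columns allowed to fill
    row = block.loc[first, fillable]
    # Turn empty strings into missing values, carry the last seen value to
    # the right, and restore "" where nothing precedes the gap.
    block.loc[first, fillable] = row.mask(row.eq("")).ffill().fillna("")
    return block


result = pd.concat(
    [right_fill_first_row(block) for _, block in have.groupby("groups")]
).sort_index()
print(result)
```

In this sketch the first row of g1 becomes abc, abc, abc, x. The first row of g2 keeps an empty string in column "1", because the row below it is empty there, and takes abc2 in column "2".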
