How to replace values in just a few columns of matching rows without a merge in Pandas

Question

I have a df1 with almost 100k rows and another df2 with only around 10 rows. I need to replace values in df1 with values from df2, for just the 30 rows where two column values match exactly.

df1:
Daily_No     Date      Name      Type       Amount     
   A12    30/01/2020    Bob       R           400        
   A14    30/01/2020    Jacob     R           1000       
   B01    31/01/2020    John      D           20         

df2:
Old_Date       New_Date       Daily_No       Type       Amount     Balance
30/01/2020    05/02/2020         A12           R         10000         0
30/01/2020    06/02/2020        KO11           D          200         10
10/08/2020    06/02/2020        B01            D          5           0

I want to overwrite the df1 columns [Daily_No, Date, Name, Type, Amount] with the df2 values [Daily_No, New_Date, '', Type, Amount] respectively. In other words, on rows where left_on=['Date', 'Daily_No'] matches right_on=['Old_Date', 'Daily_No'], copy New_Date into Date and copy the other details across. Balance and Old_Date are excluded: df1 has no Balance column, and Date in df1 is to be replaced by New_Date.

As there are only about 10 rows in df2, is there a faster way to copy the column values when the two columns match?

Desired output

Daily_No     Date      Name   Type    Amount     
   A12    05/02/2020    Bob    R      10000 <---df2 row 1 match(replaced Date with New_Date, Type, Amount)
   A14    30/01/2020    Jacob  R      1000 <---df2 no matches
   B01    31/01/2020    John   D        20  <---df2 date didn't match


Recommended answer

IIUC, you can use combine_first, since you don't want to merge.

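# Index both frames on the matching keys, then let df2's Amount take priority.
df3 = (
    df2.rename(columns={"Old_Date": "Date"})
    .set_index(["Date", "Daily_No"])[["Amount"]]
    .combine_first(df1.set_index(["Date", "Daily_No"]))
    .dropna()  # drop df2 keys that have no counterpart in df1
    .reset_index()
)

print(df3)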

         Date Daily_No   Amount   Name Type
0  30/01/2020      A12  10000.0    Bob    R
1  30/01/2020      A14   1000.0  Jacob    R
2  31/01/2020      B01     20.0   John    D
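Note that the output above updates Amount but keeps the original Date for A12, whereas the question also asks for New_Date to be copied over. Since df2 is tiny, one way to get the desired output without any merge is to loop over df2's rows and update df1 through a boolean mask. The following is a minimal sketch, not part of the original answer; the sample frames are rebuilt from the tables in the question, and the names `result` and `mask` are illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Daily_No": ["A12", "A14", "B01"],
    "Date": ["30/01/2020", "30/01/2020", "31/01/2020"],
    "Name": ["Bob", "Jacob", "John"],
    "Type": ["R", "R", "D"],
    "Amount": [400, 1000, 20],
})
df2 = pd.DataFrame({
    "Old_Date": ["30/01/2020", "30/01/2020", "10/08/2020"],
    "New_Date": ["05/02/2020", "06/02/2020", "06/02/2020"],
    "Daily_No": ["A12", "KO11", "B01"],
    "Type": ["R", "D", "D"],
    "Amount": [10000, 200, 5],
    "Balance": [0, 10, 0],
})

result = df1.copy()
for row in df2.itertuples(index=False):
    # Match against the untouched df1 so that an already-updated Date
    # cannot accidentally match a later df2 row.
    mask = (df1["Date"] == row.Old_Date) & (df1["Daily_No"] == row.Daily_No)
    result.loc[mask, "Date"] = row.New_Date
    result.loc[mask, "Type"] = row.Type
    result.loc[mask, "Amount"] = row.Amount

print(result)
```

Each iteration scans df1 once, so the cost grows with the number of rows in df2; with roughly 10 rows that is about 10 vectorized passes over the 100k rows. Name is left untouched, as in the desired output.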






Another method is to concat on the shared columns, ffill within each key group, and drop_duplicates keeping only the last value.

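# Stack df1 on top of the relevant df2 columns (pandas imported as pd).
df3 = pd.concat(
    [df1, df2.rename(columns={"Old_Date": "Date"})[["Date", "Amount", "Daily_No"]]],
    axis=0, sort=False)

# Fill df2's missing Name/Type from the matching df1 row, drop unmatched df2 rows,
# and keep the last (df2) entry for each key. The result must be assigned back.
df3 = df3.fillna(df3.groupby(["Daily_No", "Date"], sort=False).ffill()) \
    .dropna().drop_duplicates(subset=["Daily_No", "Date"], keep="last")

print(df3)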


  Daily_No        Date   Name Type  Amount
1      A14  30/01/2020  Jacob    R    1000
2      B01  31/01/2020   John    D      20
0      A12  30/01/2020    Bob    R   10000
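As with the first method, this output keeps the old Date and also reorders the rows. If the original row order and the New_Date replacement both matter, a lookup on the key pair is another option. The sketch below is an assumption-laden illustration rather than the answerer's method: it assumes the (Old_Date, Daily_No) pairs in df2 are unique (otherwise `get_indexer` raises an error), and the names `patch`, `keys1`, `pos` and `hit` are made up for this example.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Daily_No": ["A12", "A14", "B01"],
    "Date": ["30/01/2020", "30/01/2020", "31/01/2020"],
    "Name": ["Bob", "Jacob", "John"],
    "Type": ["R", "R", "D"],
    "Amount": [400, 1000, 20],
})
df2 = pd.DataFrame({
    "Old_Date": ["30/01/2020", "30/01/2020", "10/08/2020"],
    "New_Date": ["05/02/2020", "06/02/2020", "06/02/2020"],
    "Daily_No": ["A12", "KO11", "B01"],
    "Type": ["R", "D", "D"],
    "Amount": [10000, 200, 5],
    "Balance": [0, 10, 0],
})

# Key df2 on the matching pair; the pairs are assumed to be unique.
patch = df2.set_index(["Old_Date", "Daily_No"])

# For every df1 row, find the position of its key in df2 (-1 means no match).
keys1 = pd.MultiIndex.from_frame(df1[["Date", "Daily_No"]])
pos = patch.index.get_indexer(keys1)
hit = pos >= 0

# Copy the selected df2 values into the matching df1 rows, column by column,
# so each column keeps its own dtype.
result = df1.copy()
result.loc[hit, "Date"] = patch["New_Date"].to_numpy()[pos[hit]]
result.loc[hit, "Type"] = patch["Type"].to_numpy()[pos[hit]]
result.loc[hit, "Amount"] = patch["Amount"].to_numpy()[pos[hit]]

print(result)
```

This does a single lookup over df1 regardless of how many rows df2 has, and leaves df1's index and row order unchanged.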
