How to replace values in a row for just a few columns without a merge in Pandas
Question
I have a df1 with almost 100k rows and another df2 with only around 10 rows. I need to replace values in df1 with values from df2, only for the 30 rows where two column values match exactly.
df1:
Daily_No  Date        Name   Type  Amount
A12       30/01/2020  Bob    R     400
A14       30/01/2020  Jacob  R     1000
B01       31/01/2020  John   D     20
df2:
Old_Date    New_Date    Daily_No  Type  Amount  Balance
30/01/2020  05/02/2020  A12       R     10000   0
30/01/2020  06/02/2020  KO11      D     200     10
10/08/2020  06/02/2020  B01       D     5       0
I want to replace the df1 columns [Daily_No, **Date**, Name, Type, Amount]
with the df2 row values [Daily_No, **New_Date**, '', Type, Amount]
respectively. In other words, copy New_Date and the other details into Date and the corresponding columns wherever the rows match on left_on=['Date', 'Daily_No'], right_on=['Old_Date', 'Daily_No']. Balance and Old_Date are not copied: df1 has no Balance column, and Date in df1 is to be replaced by New_Date.
As there are only about 10 rows in df2, is there a faster way to copy the column values when these two columns match?
Desired output
Daily_No  Date        Name   Type  Amount
A12       05/02/2020  Bob    R     10000   <--- df2 row 1 matches (Date replaced with New_Date; Type and Amount copied)
A14       30/01/2020  Jacob  R     1000    <--- no match in df2
B01       31/01/2020  John   D     20      <--- df2 date didn't match
Recommended answer
IIUC, you can use combine_first, since you don't want to merge.
# combine_first takes df2's Amount where the keys match and df1's values elsewhere
df3 = (
    df2.rename(columns={"Old_Date": "Date"})
    .set_index(["Date", "Daily_No"])[["Amount"]]
    .combine_first(df1.set_index(["Date", "Daily_No"]))
    .dropna().reset_index()  # dropna removes df2-only keys, which have no Name or Type
)
print(df3)
         Date Daily_No   Amount   Name Type
0  30/01/2020      A12  10000.0    Bob    R
1  30/01/2020      A14   1000.0  Jacob    R
2  31/01/2020      B01     20.0   John    D
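A related way to avoid both the merge and the dropna step is DataFrame.update, which overwrites values in place where the index labels align. The following is only a sketch on the sample data from the question; the names `indexed`, `patch` and `result` are illustrative, and it assumes the (Old_Date, Daily_No) pairs in df2 are unique. Like the combine_first version, it leaves Date at its old value.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Daily_No": ["A12", "A14", "B01"],
    "Date": ["30/01/2020", "30/01/2020", "31/01/2020"],
    "Name": ["Bob", "Jacob", "John"],
    "Type": ["R", "R", "D"],
    "Amount": [400, 1000, 20],
})
df2 = pd.DataFrame({
    "Old_Date": ["30/01/2020", "30/01/2020", "10/08/2020"],
    "New_Date": ["05/02/2020", "06/02/2020", "06/02/2020"],
    "Daily_No": ["A12", "KO11", "B01"],
    "Type": ["R", "D", "D"],
    "Amount": [10000, 200, 5],
    "Balance": [0, 10, 0],
})

# Index both frames on the matching keys (assumption: keys are unique in df2).
indexed = df1.set_index(["Date", "Daily_No"])
patch = (
    df2.rename(columns={"Old_Date": "Date"})
    .set_index(["Date", "Daily_No"])[["Type", "Amount"]]
)

# update() modifies `indexed` in place; keys that exist only in df2 are ignored.
indexed.update(patch)
result = indexed.reset_index()
print(result)
```

Because update aligns on the left frame's index, rows of df2 without a counterpart in df1 never appear in the result, so no cleanup is needed afterwards.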
Another method is to concat the two frames on their shared columns, ffill within each key, and then drop_duplicates, keeping only the later value.
import pandas as pd

# Stack df1 on top of df2's key and Amount columns
df3 = pd.concat([df1,
    df2.rename(columns={'Old_Date': 'Date'})[['Date','Amount','Daily_No']]
    ],axis=0,sort=False)
# Fill Name/Type within each key, then keep the later (df2) row; assign the result back
df3 = df3.fillna(df3.groupby(['Daily_No','Date'],
    sort=False).ffill()).dropna().drop_duplicates(subset=['Daily_No','Date'],keep='last')
print(df3)
  Daily_No        Date   Name Type  Amount
1      A14  30/01/2020  Jacob    R    1000
2      B01  31/01/2020   John    D      20
0      A12  30/01/2020    Bob    R   10000
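Both outputs above still show the old Date for A12, whereas the desired output in the question replaces it with New_Date. The sketch below is one possible way to get there without a merge, not part of the original answer: it builds a small lookup from df2 and writes the matching values into a copy of df1. The names `lookup`, `keys`, `mask`, `new_vals` and `result` are illustrative, and it assumes the (Old_Date, Daily_No) pairs in df2 are unique.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Daily_No": ["A12", "A14", "B01"],
    "Date": ["30/01/2020", "30/01/2020", "31/01/2020"],
    "Name": ["Bob", "Jacob", "John"],
    "Type": ["R", "R", "D"],
    "Amount": [400, 1000, 20],
})
df2 = pd.DataFrame({
    "Old_Date": ["30/01/2020", "30/01/2020", "10/08/2020"],
    "New_Date": ["05/02/2020", "06/02/2020", "06/02/2020"],
    "Daily_No": ["A12", "KO11", "B01"],
    "Type": ["R", "D", "D"],
    "Amount": [10000, 200, 5],
    "Balance": [0, 10, 0],
})

# Small lookup keyed by (Old_Date, Daily_No); assumption: these pairs are unique.
lookup = df2.set_index(["Old_Date", "Daily_No"])[["New_Date", "Type", "Amount"]]

# Build the same key for every df1 row and flag the rows that exist in the lookup.
keys = pd.MultiIndex.from_frame(df1[["Date", "Daily_No"]])
mask = keys.isin(lookup.index)

# Fetch the replacement values in df1's row order for the matching rows only.
new_vals = lookup.reindex(keys[mask])

# Write column by column so each column keeps its own dtype.
result = df1.copy()
for src, dst in [("New_Date", "Date"), ("Type", "Type"), ("Amount", "Amount")]:
    result.loc[mask, dst] = new_vals[src].to_numpy()
print(result)
```

Only the handful of matching rows are touched, so the cost is dominated by building the key index for df1 once; Name is left as it is because df2 carries no name information.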