Compare 2 dfs and append values in df1 based on query in df2
Question
I have two dataframes. On each iteration I fetch data from a database into df2, and I want to fill in price_1, price_2, price_3 and price_4 in df1 with the values from df2 for the rows where df1.id = df2.id and df1.name = df2.name.
df1:
id name tag price_1 price_2 price_3 price_4
1 a a1
1 b b1
1 c c1
2 x d1
2 y e1
2 z a1
df2 (results from db):
1st iteration
id name tag price_1 price_2 price_3 price_4 discount
1 a x1 10 11 12 11 Y
1 b x2 11 44 22 55 Y
1 c x3 76 56 45 34 N
2nd iteration
id name tag price_1 price_2 price_3 price_4 discount
2 x x2 10 11 12 11 N
2 y x5 11 44 22 55 Y
2 z x6 76 56 45 34 N
Output:
df1 (after 1st iteration)
id name tag price_1 price_2 price_3 price_4
1 a a1 10 11 12 11
1 b b1 11 44 22 55
1 c c1 76 56 45 34
2 x
2 y
2 z
df1 (after 2nd iteration)
id name tag price_1 price_2 price_3 price_4
1 a a1 10 11 12 11
1 b b1 11 44 22 55
1 c c1 76 56 45 34
2 x d1 10 11 12 11
2 y e1 11 44 22 55
2 z a1 76 56 45 34
Loop:
grouped = df1.groupby('id')
for i,groups in grouped:
    df2 = sql(i) #goes to sql to fetch details for df1.id
    sql_df = df2.name.unique()
    dd = groups.name
    if (set(sql_df) == set(sql_df) & set(dd)) & (set(dd) == set(sql_df) & set(dd)):
        print ("ID:", i, "Names Match: Y")
        for df2 in iter:
            df4 = pd.DataFrame()
            df_temp = df1[['id', 'name']].merge(df2, on = ['id', 'name'])
            df4 = df4.append(df_temp, ignore_index = True)
    else:
        print("ID:", i, "Names Match: N")
I don't need the tag and discount columns from df2. I just need to check whether the names are equal in both df1 and df2. If they are, take all of the price_1/2/3/4 values.
Recommended answer
You can use DataFrame.append on each iteration to collect the merged rows. Because merge performs an inner join by default, only the rows that match on id and name are kept:
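import pandas as pd

# Accumulator for the merged rows of every id
df4 = pd.DataFrame()
grouped = df1.groupby('id')
for i,groups in grouped:
    df2 = sql(i) #goes to sql to fetch details for df1.id
    sql_df = df2.name.unique()
    dd = groups.name
    # True only when both sides contain exactly the same set of names
    if (set(sql_df) == set(sql_df) & set(dd)) & (set(dd) == set(sql_df) & set(dd)):
        print ("ID:", i, "Names Match: Y")
        # Keep tag from df1 and drop the one from df2; inner join on id and name
        df_temp = (df1[['id', 'name', 'tag']].merge(df2.drop('tag', axis=1),
                                                    on = ['id', 'name']))
        # DataFrame.append was removed in pandas 2.0; there, use
        # pd.concat([df4, df_temp], ignore_index=True) instead
        df4 = df4.append(df_temp, ignore_index = True)
    else:
        print("ID:", i, "Names Match: N")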
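DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same idea is usually written with pd.concat. The following is only a minimal sketch of that variant, not the answer's original code. It rebuilds the question's sample data, and the `_FAKE_DB` dictionary and the `sql` function defined here are hypothetical stand-ins for the real database call. It also selects only the id, name and price columns from df2, which is an assumption based on the question saying that tag and discount are not needed.

```python
import pandas as pd

# df1 from the question (price columns are filled in by the merge)
df1 = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "name": ["a", "b", "c", "x", "y", "z"],
    "tag": ["a1", "b1", "c1", "d1", "e1", "a1"],
})

# Hypothetical stand-in for the database: one result frame per id
_FAKE_DB = {
    1: pd.DataFrame({
        "id": [1, 1, 1], "name": ["a", "b", "c"], "tag": ["x1", "x2", "x3"],
        "price_1": [10, 11, 76], "price_2": [11, 44, 56],
        "price_3": [12, 22, 45], "price_4": [11, 55, 34],
        "discount": ["Y", "Y", "N"],
    }),
    2: pd.DataFrame({
        "id": [2, 2, 2], "name": ["x", "y", "z"], "tag": ["x2", "x5", "x6"],
        "price_1": [10, 11, 76], "price_2": [11, 44, 56],
        "price_3": [12, 22, 45], "price_4": [11, 55, 34],
        "discount": ["N", "Y", "N"],
    }),
}


def sql(i):
    """Hypothetical replacement for the question's sql(i) call."""
    return _FAKE_DB[i]


price_cols = ["price_1", "price_2", "price_3", "price_4"]
pieces = []  # one merged frame per id whose names match

for i, group in df1.groupby("id"):
    df2 = sql(i)
    # Same test as the original condition: both name sets must be equal
    if set(df2["name"]) == set(group["name"]):
        print("ID:", i, "Names Match: Y")
        # Keep df1's tag; take only the keys and prices from df2
        merged = group.merge(df2[["id", "name"] + price_cols], on=["id", "name"])
        pieces.append(merged)
    else:
        print("ID:", i, "Names Match: N")

# Concatenate once at the end; guard against the case where nothing matched
if pieces:
    df4 = pd.concat(pieces, ignore_index=True)
else:
    df4 = pd.DataFrame(columns=["id", "name", "tag"] + price_cols)

print(df4)
```

Collecting the pieces in a list and calling pd.concat once avoids copying the growing frame on every iteration, which is what repeated appends would do.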