Compare 2 dfs and append values in df1 based on query in df2


Question

I have two DataFrames. On each iteration I fetch data from the database into df2, and I want to copy its price_1, price_2, price_3 and price_4 values into df1 for the rows where df1.id = df2.id and df1.name = df2.name.

df1:

id  name   tag    price_1 price_2 price_3 price_4
1   a      a1         
1   b      b1
1   c      c1
2   x      d1
2   y      e1
2   z      a1
           

df2 (results from the db):

1st iteration
id  name   tag  price_1 price_2 price_3 price_4  discount
1   a      x1   10      11      12      11       Y
1   b      x2   11      44      22      55       Y
1   c      x3   76      56      45      34       N

2nd iteration
id  name   tag  price_1 price_2 price_3 price_4  discount
2   x      x2   10      11      12      11       N
2   y      x5   11      44      22      55       Y
2   z      x6   76      56      45      34       N

Expected output:

df1 (after 1st iteration)
id  name   tag    price_1 price_2 price_3 price_4
1   a      a1     10      11      12      11
1   b      b1     11      44      22      55
1   c      c1     76      56      45      34
2   x       
2   y       
2   z       

df1 (after 2nd iteration)
id  name   tag    price_1 price_2 price_3 price_4
1   a      a1     10      11      12      11
1   b      b1     11      44      22      55
1   c      c1     76      56      45      34
2   x      d1     10      11      12      11
2   y      e1     11      44      22      55
2   z      a1     76      56      45      34

My loop:

grouped = df1.groupby('id')

for i,groups in grouped:
    df2 = sql(i) #goes to sql to fetch details for df1.id
    sql_df = df2.name.unique()
    dd = groups.name
    if (set(sql_df) == set(sql_df) & set(dd)) & (set(dd) == set(sql_df) & set(dd)):
        print ("ID:", i, "Names Match: Y")
        for df2 in iter:
            df4 = pd.DataFrame()
            df_temp = df1[['id', 'name']].merge(df2, on = ['id', 'name'])
            df4 = df4.append(df_temp, ignore_index = True)
    else:
        print("ID:", i, "Names Match: N")

I don't need the tag and discount columns from df2. I only need to check whether the names are equal in df1 and df2; if they are, take all of price_1/2/3/4.

Recommended answer

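Create the result DataFrame once, before the loop, and add the merged rows to it on each iteration. merge performs an inner join by default, so only rows whose id and name appear in both frames are kept. Note that DataFrame.append was removed in pandas 2.0; pd.concat is its replacement: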

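import pandas as pd

frames = []  # merged rows, one DataFrame per matching id
for i, group in df1.groupby('id'):
    df2 = sql(i)  # the asker's helper: goes to sql to fetch details for df1.id
    # Equivalent to the original two-sided subset test: both name sets must be equal
    if set(df2['name']) == set(group['name']):
        print("ID:", i, "Names Match: Y")
        # Drop df2's tag and discount (not wanted) so df1's tag is the one kept
        prices = df2.drop(columns=['tag', 'discount'])
        frames.append(group[['id', 'name', 'tag']].merge(prices, on=['id', 'name']))
    else:
        print("ID:", i, "Names Match: N")

# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
df4 = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()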
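
To try the approach without a database, here is a minimal self-contained sketch built from the sample data in the question. The `sql` function below is a hypothetical stand-in for the asker's real query helper, and `_DB` and `collect_prices` are names invented for this illustration. It collects the merged rows in a list and concatenates them once with `pd.concat`, which also works on pandas versions where `DataFrame.append` is no longer available.

```python
import pandas as pd

# Sample df1 from the question (the empty price columns are left out).
df1 = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 2],
    "name": ["a", "b", "c", "x", "y", "z"],
    "tag":  ["a1", "b1", "c1", "d1", "e1", "a1"],
})

# Hypothetical in-memory "database" holding the rows shown for both iterations.
_DB = pd.DataFrame({
    "id":       [1, 1, 1, 2, 2, 2],
    "name":     ["a", "b", "c", "x", "y", "z"],
    "tag":      ["x1", "x2", "x3", "x2", "x5", "x6"],
    "price_1":  [10, 11, 76, 10, 11, 76],
    "price_2":  [11, 44, 56, 11, 44, 56],
    "price_3":  [12, 22, 45, 12, 22, 45],
    "price_4":  [11, 55, 34, 11, 55, 34],
    "discount": ["Y", "Y", "N", "N", "Y", "N"],
})


def sql(i):
    """Hypothetical stand-in for the asker's sql(i): rows for a single id."""
    return _DB[_DB["id"] == i].reset_index(drop=True)


def collect_prices(df1, fetch):
    """Return df1's id/name/tag plus price columns for ids whose names match."""
    frames = []
    for i, group in df1.groupby("id"):
        df2 = fetch(i)
        # Skip ids whose set of names differs between df1 and the fetched rows.
        if set(df2["name"]) != set(group["name"]):
            continue
        # Keep df1's tag; drop the tag and discount columns coming from df2.
        prices = df2.drop(columns=["tag", "discount"])
        frames.append(group[["id", "name", "tag"]].merge(prices, on=["id", "name"]))
    if not frames:
        # pd.concat raises on an empty list, so return an empty frame instead.
        return pd.DataFrame(columns=["id", "name", "tag"])
    return pd.concat(frames, ignore_index=True)


result = collect_prices(df1, sql)
print(result)
```

Building a list and concatenating once avoids copying the growing result on every iteration, which is what repeated appends would do.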
