Pandas - 2 dataframes, add Index column of df1 to df2 on second column

Views: 234 | Published: 2020/5/6 9:28:39 | Tags: pandas, merge, match

Question

I have 2 dataframes:

df1 (sample, has more columns):

+---+----------------+--------------+-----------+
|   |     Region     | Placement ID |   Units   |
+---+----------------+--------------+-----------+
| 0 | Western Europe | 1.10872E+13  | 367628.76 |
| 1 | Western Europe | 1.10872E+13  | 367628.76 |
| 2 | Western Europe | 1.10872E+13  | 74604.63  |
+---+----------------+--------------+-----------+

df2 (sample, has more columns):

+-----------+----------------+--------------+
| Creatives | Publisher Name | Placement ID |
+-----------+----------------+--------------+
| Temenos   | Quantcast      | 1.10872E+13  |
| Temenos   | Quantcast      | 1.10872E+13  |
| Temenos   | Quantcast      | 1.10872E+13  |
+-----------+----------------+--------------+

What I would like to do is add an extra column to dataframe 2 that holds the index of dataframe 1, matched on Placement ID.

Some Placement ID fields in dataframe 1 or 2 might be empty or hold erroneous values. When there is no match, or an error is found, I would like to add a Missing or Error value such as N/A or Missing, or leave the cell blank.

Recommended answer

If I understand correctly, you need merge, but the duplicate Placement ID values in df1 are a problem. So first remove them with drop_duplicates, then select two columns from df1: the one to add (Units) and the one to join on (Placement ID):

print (pd.merge(df2, 
                df1.drop_duplicates('Placement ID')[['Units', 'Placement ID']], 
                how='left', 
                on='Placement ID'))


  Creatives Publisher Name  Placement ID      Units
0   Temenos      Quantcast  1.108720e+13  367628.76
1   Temenos      Quantcast  1.108720e+13  367628.76
2   Temenos      Quantcast  1.108720e+13  367628.76
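
For anyone who wants to run the merge above, here is a minimal sketch. It assumes the two sample frames from the question, rebuilt by hand with only the columns shown; the variable names right and merged are just for illustration.

```python
import pandas as pd

# Sample frames rebuilt from the question (the real ones have more columns).
df1 = pd.DataFrame({
    "Region": ["Western Europe"] * 3,
    "Placement ID": [1.10872e13] * 3,
    "Units": [367628.76, 367628.76, 74604.63],
})
df2 = pd.DataFrame({
    "Creatives": ["Temenos"] * 3,
    "Publisher Name": ["Quantcast"] * 3,
    "Placement ID": [1.10872e13] * 3,
})

# Keep one row per Placement ID on the right side,
# so each df2 row can match at most one df1 row.
right = df1.drop_duplicates("Placement ID")[["Units", "Placement ID"]]

# how="left" keeps every df2 row, matched or not.
merged = pd.merge(df2, right, how="left", on="Placement ID")
print(merged)
```

Note that drop_duplicates keeps the first row for each Placement ID by default, which is why every df2 row receives the Units value of the first df1 row.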

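If you need to add the index instead, call reset_index first so that the index becomes a regular column. reset_index names that column index by default, and level_0 (as in the output below) when df1 already has a column named index: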

print (pd.merge(df2, 
                df1.drop_duplicates('Placement ID')
                   .reset_index()[['level_0','Placement ID']], 
                how='left', 
                on='Placement ID'))

  Creatives Publisher Name  Placement ID  level_0
0   Temenos      Quantcast  1.108720e+13        0
1   Temenos      Quantcast  1.108720e+13        0
2   Temenos      Quantcast  1.108720e+13        0
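
The question also asks for a Missing marker when a Placement ID has no match, which the merge above does not cover. A possible sketch, under some assumptions: the extra IDs 2.0e13 and 9.9e13 and the empty ID are made up to produce non-matching rows, and the column name df1_index is a hypothetical choice made with rename_axis so that the result does not depend on how reset_index names the column.

```python
import pandas as pd

# Illustrative data: the IDs 2.0e13 and 9.9e13 and the empty ID are invented
# here to produce rows without a match; they are not from the question.
df1 = pd.DataFrame({
    "Region": ["Western Europe"] * 3,
    "Placement ID": [1.10872e13, 1.10872e13, 2.0e13],
    "Units": [367628.76, 367628.76, 74604.63],
})
df2 = pd.DataFrame({
    "Creatives": ["Temenos"] * 3,
    "Publisher Name": ["Quantcast"] * 3,
    "Placement ID": [1.10872e13, 9.9e13, None],
})

# Name the index explicitly ("df1_index" is a hypothetical name), turn it into
# a column, and keep the first row per Placement ID.
lookup = (df1.rename_axis("df1_index")
             .reset_index()
             .drop_duplicates("Placement ID")[["df1_index", "Placement ID"]])

# A left merge keeps every df2 row; rows without a match get NaN.
merged = pd.merge(df2, lookup, how="left", on="Placement ID")

# Replace NaN with a marker and turn matched values back into integers.
merged["df1_index"] = merged["df1_index"].map(
    lambda v: "Missing" if pd.isna(v) else int(v)
)
print(merged)
```

The resulting column mixes integers and strings, so it has object dtype. If the index is needed for further computation, leaving the NaN values in place may be the better choice.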

The duplicates have to be dropped because merge returns one row for every matching pair of keys. The value 1.108720e+13 appears in 3 rows of df2 and in 3 rows of df1, so without drop_duplicates you get 3 x 3 = 9 rows:

print (pd.merge(df2, 
                df1.reset_index()[['level_0', 'Placement ID']], 
                how='left', 
                on='Placement ID'))

  Creatives Publisher Name  Placement ID  level_0
0   Temenos      Quantcast  1.108720e+13        0
1   Temenos      Quantcast  1.108720e+13        1
2   Temenos      Quantcast  1.108720e+13        2
3   Temenos      Quantcast  1.108720e+13        0
4   Temenos      Quantcast  1.108720e+13        1
5   Temenos      Quantcast  1.108720e+13        2
6   Temenos      Quantcast  1.108720e+13        0
7   Temenos      Quantcast  1.108720e+13        1
8   Temenos      Quantcast  1.108720e+13        2
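
To catch this row multiplication early rather than spot it in the output, the validate parameter of pd.merge can check key uniqueness. A small sketch, assuming the sample frames from the question; the names df1_index, blown_up and duplicates_detected are illustrative only.

```python
import pandas as pd

# Sample frames rebuilt from the question (the real ones have more columns).
df1 = pd.DataFrame({
    "Region": ["Western Europe"] * 3,
    "Placement ID": [1.10872e13] * 3,
    "Units": [367628.76, 367628.76, 74604.63],
})
df2 = pd.DataFrame({
    "Creatives": ["Temenos"] * 3,
    "Publisher Name": ["Quantcast"] * 3,
    "Placement ID": [1.10872e13] * 3,
})

# Index as a column, but WITHOUT dropping duplicate Placement IDs.
right = df1.rename_axis("df1_index").reset_index()[["df1_index", "Placement ID"]]

# Every df2 row pairs with every matching df1 row: 3 x 3 rows.
blown_up = pd.merge(df2, right, how="left", on="Placement ID")
print(len(df2), "->", len(blown_up))

# validate="many_to_one" makes pandas check that the right-hand keys are
# unique and raise MergeError if they are not.
try:
    pd.merge(df2, right, how="left", on="Placement ID", validate="many_to_one")
    duplicates_detected = False
except pd.errors.MergeError:
    duplicates_detected = True
print(duplicates_detected)
```

With validate in place, forgetting drop_duplicates fails loudly instead of silently tripling the number of rows.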
