如果多个列与字典中的值匹配,则用值填充数据框列 [英] Fill dataframe column with a value if multiple columns match values in a dictionary

查看:34
本文介绍了如果多个列与字典中的值匹配,则用值填充数据框列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个数据框-一个大的数据框,其中包含多个分类列,而一个列中缺少值,而另一个则类似于具有相同分类列的字典和一个具有键值的列.

I have two dataframes - one large dataframe with multiple categorical columns and one column with missing values, and another that's sort of a dictionary with the same categorical columns and one column with a key value.

基本上,如果所有分类列都匹配,我想用第二个键值填充大数据框中的缺失值.

Essentially, I want to fill the missing values in the large dataframe with the key value in the second if all the categorical columns match.

缺少值df:

    Color      Number    Letter       Value
0   Red          2          B          NaN
1   Green        2          A          NaN 
2   Red          2          B          NaN
3   Red          1          B          NaN
4   Green        1          A          NaN
5   Red          2          B          NaN
6   Green        1          B          NaN
7   Green        2          A          NaN

字典df:

    Color      Number    Letter       Value
0   Red          1          A          10
1   Red          1          B           4
2   Red          2          A           3
3   Red          2          B          15
4   Green        1          A          21
5   Green        1          B           9
6   Green        2          A          22
7   Green        2          B           1

所需的df:

0   Red          2          B          15
1   Green        2          A          22 
2   Red          2          B          15
3   Red          1          B          4
4   Green        1          A          21
5   Red          2          B          15
6   Green        1          B          9
7   Green        2          A          22

我不确定是否应该将'dictionary df'作为实际的字典,还是将其保留为数据帧(从csv中提取).

I'm not sure if I should have the 'dictionary df' as an actual dictionary, or keep it as a dataframe (it's pulled from a csv).

是否可以在没有大量if else语句的情况下进行干净的处理?

Is this possible to do cleanly without a myriad of if else statements?

谢谢!

推荐答案

这项工作有效吗?

>>> df_1[['Color', 'Number', 'Letter']].merge(df_2, 
...                                           on=('Color', 'Number', 'Letter'),   
...                                           how='left')
   Color  Number Letter  Value
0    Red       2      B     15
1  Green       2      A     22
2    Red       2      B     15
3    Red       1      B      4
4  Green       1      A     21
5    Red       2      B     15
6  Green       1      B      9
7  Green       2      A     22

值得一提的是-一种将示例从stackoverflow熊猫问题转换为数据框的非常简单的方法,只需将其剪切并粘贴到这样的字符串中即可:

Thought it worth mentioning - a very simple way to convert examples from stackoverflow pandas questions into a dataframe, just cut and paste it into a string like this:

>>> df_1 = pd.read_csv(StringIO("""
...     Color      Number    Letter       Value
... 0   Red          2          B          NaN
... 1   Green        2          A          NaN
... 2   Red          2          B          NaN
... 3   Red          1          B          NaN
... 4   Green        1          A          NaN
... 5   Red          2          B          NaN
... 6   Green        1          B          NaN
... 7   Green        2          A          NaN
... """), sep=r'\s+')

这篇关于如果多个列与字典中的值匹配,则用值填充数据框列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆