Merge Only When Value is Empty/Null in Pandas

Question

I have two DataFrames in pandas, df.A and df.B, which are being merged together. df.A is the original, and df.B has the new data I want to bring over. The merge works fine and, as expected, I get two columns, col_x and col_y, in the merged df.

However, in some rows the original df.A has a value where df.B does not. My question is: how can I selectively take the values from col_x and col_y and place them into a new column, such as col_z?

Here's what I mean. How can I merge df.A:

date   impressions    spend    col
1/1/15 100000         3.00     ABC123456
1/2/15 145000         5.00     ABCD00000
1/3/15 300000         15.00    (null)

with df.B:

date    col
1/1/15  (null)
1/2/15  (null)
1/3/15  DEF123456

to get:

date   impressions    spend    col_z
1/1/15 100000         3.00     ABC123456
1/2/15 145000         5.00     ABCD00000
1/3/15 300000         15.00    DEF123456

Any help or a pointer in the right direction would be really appreciated!

Thanks

Recommended answer

OK, assuming that your (null) values are in fact NaN values and not the literal string "(null)", the following works:

In [10]:
# create the merged df
merged = dfA.merge(dfB, on='date')
merged

Out[10]:
        date  impressions  spend      col_x      col_y
0 2015-01-01       100000      3  ABC123456        NaN
1 2015-01-02       145000      5  ABCD00000        NaN
2 2015-01-03       300000     15        NaN  DEF123456
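
The merge above relies on the missing entries being real NaN values. If they were instead stored as the literal string "(null)", one option would be to blank them out before merging. The following is a minimal sketch, assuming that exact placeholder string; the two frames are a hand-built reconstruction of the question's tables, not the asker's actual data:

```python
import pandas as pd

# Hypothetical reconstruction of the question's frames, with "(null)" kept as a string.
dfA = pd.DataFrame({
    "date": pd.to_datetime(["1/1/15", "1/2/15", "1/3/15"], format="%m/%d/%y"),
    "impressions": [100000, 145000, 300000],
    "spend": [3.00, 5.00, 15.00],
    "col": ["ABC123456", "ABCD00000", "(null)"],
})
dfB = pd.DataFrame({
    "date": pd.to_datetime(["1/1/15", "1/2/15", "1/3/15"], format="%m/%d/%y"),
    "col": ["(null)", "(null)", "DEF123456"],
})

# Series.mask replaces entries where the condition is True with NaN.
dfA["col"] = dfA["col"].mask(dfA["col"] == "(null)")
dfB["col"] = dfB["col"].mask(dfB["col"] == "(null)")

# Both frames have a "col" column, so the merge suffixes them as col_x and col_y.
merged = dfA.merge(dfB, on="date")
print(merged)
```

After this step the placeholders are genuine missing values, so notnull() and similar checks behave as the answer expects.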

You can use where to conditionally assign a value from the _x and _y columns. Series.where keeps the value from col_x wherever the condition is True and takes the value from col_y everywhere else:

In [11]:
# now create col_z using where
merged['col_z'] = merged['col_x'].where(merged['col_x'].notnull(), merged['col_y'])
merged

Out[11]:
        date  impressions  spend      col_x      col_y      col_z
0 2015-01-01       100000      3  ABC123456        NaN  ABC123456
1 2015-01-02       145000      5  ABCD00000        NaN  ABCD00000
2 2015-01-03       300000     15        NaN  DEF123456  DEF123456
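
For this particular pattern (keep col_x, fall back to col_y when col_x is missing), pandas also offers fillna and combine_first, which should give the same result as the where call. The following is a minimal sketch, assuming a hand-built stand-in for the merged frame shown in Out[10]; the variable names via_where, via_fillna and via_combine are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the merged frame shown in Out[10].
merged = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "impressions": [100000, 145000, 300000],
    "spend": [3.0, 5.0, 15.0],
    "col_x": ["ABC123456", "ABCD00000", np.nan],
    "col_y": [np.nan, np.nan, "DEF123456"],
})

# The where-based version from the answer: keep col_x where it is non-null, else use col_y.
via_where = merged["col_x"].where(merged["col_x"].notnull(), merged["col_y"])

# Shorter alternatives: fill the gaps in col_x with the aligned values from col_y.
via_fillna = merged["col_x"].fillna(merged["col_y"])
via_combine = merged["col_x"].combine_first(merged["col_y"])

merged["col_z"] = via_combine
print(merged[["date", "col_z"]])
```

All three variants prefer col_x, so if a row had values in both columns the col_x value would win; swap the two columns if the new data should take priority.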

You can then drop the extraneous columns:

In [13]:
# drop the intermediate columns now that col_z holds the combined values
merged = merged.drop(['col_x','col_y'],axis=1)
merged

Out[13]:
        date  impressions  spend      col_z
0 2015-01-01       100000      3  ABC123456
1 2015-01-02       145000      5  ABCD00000
2 2015-01-03       300000     15  DEF123456
