Merge Only When Value is Empty/Null in Pandas
Question
I have two dataframes in Pandas, df.A and df.B, which are being merged together. df.A is the original, and df.B has the new data I want to bring over. The merge works fine and, as expected, I get two columns, col_x and col_y, in the merged df.
However, in some rows the original df.A has values where df.B does not. My question is: how can I selectively take the values from col_x and col_y and place them into a new column such as col_z?
Here's what I mean. How can I merge df.A:
date impressions spend col
1/1/15 100000 3.00 ABC123456
1/2/15 145000 5.00 ABCD00000
1/3/15 300000 15.00 (null)
with df.B:
date col
1/1/15 (null)
1/2/15 (null)
1/3/15 DEF123456
to get:
date impressions spend col_z
1/1/15 100000 3.00 ABC123456
1/2/15 145000 5.00 ABCD00000
1/3/15 300000 15.00 DEF123456
Any help or a pointer in the right direction would be really appreciated!
Thanks
Recommended answer
OK, assuming that your (null) values are in fact NaN values and not the literal string "(null)", the following works:
In [10]:
# create the merged df
merged = dfA.merge(dfB, on='date')
merged
Out[10]:
date impressions spend col_x col_y
0 2015-01-01 100000 3 ABC123456 NaN
1 2015-01-02 145000 5 ABCD00000 NaN
2 2015-01-03 300000 15 NaN DEF123456
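The session above assumes dfA and dfB already exist and hold real NaN values. If you want to reproduce it, here is a minimal sketch that builds both frames from the sample data in the question. It assumes the worst case, where the missing entries arrived as the literal string "(null)", and converts them with replace first. The ISO date strings are my own choice for the example.

```python
import numpy as np
import pandas as pd

# Sample data from the question; "(null)" is assumed to be a literal string here
dfA = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "impressions": [100000, 145000, 300000],
    "spend": [3.00, 5.00, 15.00],
    "col": ["ABC123456", "ABCD00000", "(null)"],
})
dfB = pd.DataFrame({
    "date": pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
    "col": ["(null)", "(null)", "DEF123456"],
})

# Turn the placeholder string into a real missing value so notnull() works
dfA["col"] = dfA["col"].replace("(null)", np.nan)
dfB["col"] = dfB["col"].replace("(null)", np.nan)

# Both frames have a non-key column named "col", so merge adds _x and _y suffixes
merged = dfA.merge(dfB, on="date")
print(merged)
```

If your data already contains NaN (for example because it was read with read_csv and "(null)" was listed in na_values), the two replace lines are unnecessary.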
You can use where to conditionally assign a value from the _x and _y columns:
In [11]:
# now create col_z using where
merged['col_z'] = merged['col_x'].where(merged['col_x'].notnull(), merged['col_y'])
merged
Out[11]:
date impressions spend col_x col_y col_z
0 2015-01-01 100000 3 ABC123456 NaN ABC123456
1 2015-01-02 145000 5 ABCD00000 NaN ABCD00000
2 2015-01-03 300000 15 NaN DEF123456 DEF123456
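As a side note, the where call keeps col_x wherever it is not null and falls back to col_y otherwise. For this particular pattern, fillna and combine_first should give the same result. The sketch below compares the three on a small stand-in for the merged frame (the frame is rebuilt here only so the snippet runs on its own):

```python
import numpy as np
import pandas as pd

# Stand-in for the merged frame, reduced to the two columns of interest
merged = pd.DataFrame({
    "col_x": ["ABC123456", "ABCD00000", np.nan],
    "col_y": [np.nan, np.nan, "DEF123456"],
})

# The approach from the answer: keep col_x where it is not null, else use col_y
via_where = merged["col_x"].where(merged["col_x"].notnull(), merged["col_y"])

# Alternatives: fill the gaps in col_x with the aligned values from col_y
via_fillna = merged["col_x"].fillna(merged["col_y"])
via_combine = merged["col_x"].combine_first(merged["col_y"])

print(via_where.tolist())
print(via_fillna.tolist())
print(via_combine.tolist())
```

In every variant col_x wins when both columns have a value; swap the two columns if the new data should take priority instead.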
You can then drop the extraneous columns:
In [13]:
merged = merged.drop(['col_x','col_y'],axis=1)
merged
Out[13]:
date impressions spend col_z
0 2015-01-01 100000 3 ABC123456
1 2015-01-02 145000 5 ABCD00000
2 2015-01-03 300000 15 DEF123456
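If you need this in more than one place, the three steps (merge, where, drop) can be wrapped in a small function. This is only a sketch: merge_fill_missing and its parameters are names I made up for illustration, and it assumes an inner merge on a single key with the default-style _x/_y suffixes.

```python
import numpy as np
import pandas as pd

def merge_fill_missing(left, right, on, col, out_col):
    """Hypothetical helper: merge, keep left's value, fall back to right's."""
    merged = left.merge(right, on=on, suffixes=("_x", "_y"))
    x, y = col + "_x", col + "_y"
    # Keep the left value where present, otherwise take the right value
    merged[out_col] = merged[x].where(merged[x].notnull(), merged[y])
    # Remove the two suffixed helper columns
    return merged.drop([x, y], axis=1)

# Sample data from the question, with missing entries already stored as NaN
dfA = pd.DataFrame({
    "date": ["2015-01-01", "2015-01-02", "2015-01-03"],
    "impressions": [100000, 145000, 300000],
    "spend": [3.00, 5.00, 15.00],
    "col": ["ABC123456", "ABCD00000", np.nan],
})
dfB = pd.DataFrame({
    "date": ["2015-01-01", "2015-01-02", "2015-01-03"],
    "col": [np.nan, np.nan, "DEF123456"],
})

result = merge_fill_missing(dfA, dfB, on="date", col="col", out_col="col_z")
print(result)
```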