How to get data in the right dataframe that isn't in the left dataframe
Question
I have two dataframes and I am trying to output the data that is in one but not the other.
I can get the data that is in the first dataframe but not in the second using
only_new = old.merge(
    new, 'outer', on=['Employee ID', 'Benefit Plan Type'],
    suffixes=['','_'], indicator=True
).query('_merge == "left_only"').reindex_axis(old.columns, axis=1)
Here is what I'm using to get the data that's only in my second dataframe
only_new = new.merge(
    old, 'outer', on=['Employee ID', 'Benefit Plan Type'],
    suffixes=['','_'], indicator=True
).query('_merge == "left only"').reindex_axis(new.columns, axis=1)
But it doesn't return any data, even though I can see in Excel that there should be a couple of rows.
It seems like this should work
only_new = old.merge(new, on='Employee ID', indicator=True, how='outer',
only_new[only_new['_merge'] == 'right_only'])
But I get
SyntaxError: non-keyword arg after keyword arg
Recommended answer
Consider the dataframes old and new
import pandas as pd

old = pd.DataFrame(dict(
    ID=[1, 2, 3, 4, 5],
    Type=list('AAABB'),
    Total=[9 for _ in range(5)],
    ArbitraryColumn=['blah' for _ in range(5)]
))
new = pd.DataFrame(dict(
    ID=[3, 4, 5, 6, 7],
    Type=list('ABBCC'),
    Total=[9 for _ in range(5)],
    ArbitraryColumn=['blah' for _ in range(5)]
))
Then take the symmetric version of the same solution
old.merge(
    new, how='outer', on=['ID', 'Type'],
    suffixes=['_', ''], indicator=True    # changed order of suffixes
).query('_merge == "right_only"').reindex(columns=new.columns)
#                  \......../                          \./
#        changed from `left` to `right`        reindex with `new`
ArbitraryColumn ID Total Type
5 blah 6 9.0 C
6 blah 7 9.0 C
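For reference, here is a minimal self-contained sketch of the same idea, split into two statements so the `merge(...)` call is closed before the filter is applied (the unclosed parenthesis is what triggers the `SyntaxError` in the question). It assumes a current pandas version, where `reindex(columns=...)` takes the place of `reindex_axis`. The `_old` suffix and the `keys`, `merged` names are illustrative choices, not part of the original answer.

```python
import pandas as pd

old = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Type": list("AAABB"),
    "Total": [9] * 5,
    "ArbitraryColumn": ["blah"] * 5,
})
new = pd.DataFrame({
    "ID": [3, 4, 5, 6, 7],
    "Type": list("ABBCC"),
    "Total": [9] * 5,
    "ArbitraryColumn": ["blah"] * 5,
})

keys = ["ID", "Type"]  # columns that identify a row in both frames

# Step 1: outer merge with the indicator column, written as a complete statement.
# Columns coming from `old` get the (assumed) "_old" suffix; columns from `new`
# keep their plain names, so they can be selected by `new.columns` afterwards.
merged = old.merge(new, how="outer", on=keys, suffixes=("_old", ""), indicator=True)

# Step 2: keep rows whose key appears only in `new`, then restore its column layout.
only_new = (
    merged[merged["_merge"] == "right_only"]
    .reindex(columns=new.columns)
    .reset_index(drop=True)
)
print(only_new)
```

The values in the `_merge` column are `left_only`, `right_only`, and `both`, each written with an underscore. A filter on `"left only"` with a space therefore matches no rows, which would explain an empty result.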