Picking out elements based on complement of records in Python pandas


Problem description

I have a question about Python pandas DataFrames. There are two DataFrames containing records, df1 and df2. They contain the following values:

df1:
   pkid  start   end
0     0   2005  2005
1     1   2006  2006
2     2   2007  2007
3     3   2008  2008
4     4   2009  2009

df2:
   pkid  start   end
0     3   2008  2008
1   NaN   2009  2009
2   NaN   2010  2010

I want to isolate the record with index 2 from df2. In other words, I want to find all records of df2 that have no matching record in df1, where only the start and end column values are considered. Thanks!

Recommended answer

This operation is called an antijoin (▷) in relational algebra and SQL. I tried to find a native pandas operation for it, but found nothing.

But you can do it in a functional way; I don't know about the performance :)

>>> t1 = df1[["start", "end"]]
>>> t2 = df2[["start", "end"]]
>>> # f[i] is True when row i of t2 equals some row of t1, column by column
>>> f = t2.apply(lambda x2: t1.apply(lambda x1: (x1 == x2).all(), axis=1).any(), axis=1)
>>> df2[~f]
    end  pkid  start
2  2010   NaN   2010
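
The nested apply above compares every row of df2 with every row of df1 in Python, so it is likely to be slow on large frames. A minimal sketch of a vectorized alternative follows. It assumes the frames from the question (rebuilt here by hand from the printed output) and uses a hypothetical helper name, `anti_join_on`, which is not part of pandas. It builds a MultiIndex from the key columns of each frame and keeps the rows of the left frame whose key tuple is absent from the right frame.

```python
import numpy as np
import pandas as pd

# Frames rebuilt by hand from the output printed in the question.
df1 = pd.DataFrame({
    "pkid": [0, 1, 2, 3, 4],
    "start": [2005, 2006, 2007, 2008, 2009],
    "end": [2005, 2006, 2007, 2008, 2009],
})
df2 = pd.DataFrame({
    "pkid": [3, np.nan, np.nan],
    "start": [2008, 2009, 2010],
    "end": [2008, 2009, 2010],
})


def anti_join_on(left, right, keys):
    """Return the rows of `left` whose `keys` tuple does not occur in `right`.

    Hypothetical helper for illustration; not a pandas API.
    """
    left_keys = pd.MultiIndex.from_frame(left[keys])
    right_keys = pd.MultiIndex.from_frame(right[keys])
    # isin returns a boolean array aligned with the rows of `left`.
    return left[~left_keys.isin(right_keys)]


result = anti_join_on(df2, df1, ["start", "end"])
print(result)
```

Because the mask is applied to df2 directly, the result keeps df2's original index and columns.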


Update: In SQL, this can be done in several ways, for example with not exists:

select *
from df2
where not exists (select * from df1 where df1.start = df2.start and df1.end = df2.end)

or with a left outer join and a where clause:

select *
from df2
    left outer join df1 on df1.start = df2.start and df1.end = df2.end
where df1.<key> is null

The last one can be implemented in pandas with merge:

>>> m = pd.merge(df2, df1, how='left', on=['end','start'], suffixes=['','_r'])
>>> df2[m['pkid_r'].isnull()]
    end  pkid  start
2  2010   NaN   2010
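
The merge above relies on `pkid_r` being null only for unmatched rows, and on the merged frame having the same row positions as df2. As a hedged sketch of a variant that avoids both assumptions, `merge` accepts `indicator=True`, which adds a `_merge` column marking each row as `left_only`, `right_only`, or `both`. The frames are rebuilt by hand from the question; the names `keys`, `right`, and `unmatched` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Frames rebuilt by hand from the output printed in the question.
df1 = pd.DataFrame({
    "pkid": [0, 1, 2, 3, 4],
    "start": [2005, 2006, 2007, 2008, 2009],
    "end": [2005, 2006, 2007, 2008, 2009],
})
df2 = pd.DataFrame({
    "pkid": [3, np.nan, np.nan],
    "start": [2008, 2009, 2010],
    "end": [2008, 2009, 2010],
})

keys = ["start", "end"]

# Keep only df1's key columns and drop duplicate keys, so the left merge
# cannot multiply rows of df2 and no column suffixes are needed.
right = df1[keys].drop_duplicates()

m = df2.merge(right, how="left", on=keys, indicator=True)

# Rows flagged "left_only" have no matching (start, end) pair in df1.
unmatched = m.loc[m["_merge"] == "left_only"].drop(columns="_merge")
print(unmatched)
```

Note that the merged frame gets a fresh integer index, so the labels in `unmatched` match df2's only when df2 has a default index, as it does here.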
