python大 pandas :如何在一个数据帧中找到行,但在另一个数据帧中找不到行? [英] python pandas: how to find rows in one dataframe but not in another?

查看:105
本文介绍了python大 pandas :如何在一个数据帧中找到行,但在另一个数据帧中找不到行?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我有两个表: people_all people_usa ,都具有相同的结构,因此相同主键。



如何获取不在美国的人的桌子?
在SQL中,我将执行以下操作:

 从people_all a $ b中选择一个* 

$ b left outer join people_usa u
on a.id = u.id

其中u.id为空

Python相当于什么?我不能想到一种将这个语句翻译成大熊猫语法的方法。



我想到的唯一方法是向people_usa添加一个任意字段(例如 people_usa ['dummy'] = 1 ),做一个左连接,然后只采用dummy是nan的记录,然后删除虚拟字段 - 这似乎有点复杂。 / p>

谢谢!

解决方案

使用 isin 并取消布尔掩码:

  people_usa [〜people_usa ['ID']。isin(people_all [ 'ID'])] 

示例:

 在[364]:
people_all = pd.DataFrame({'ID':np.arange(5)})
people_usa = pd.DataFrame({ 'ID':[3,4,6,7,100]})
people_usa [〜people_usa ['ID']。isin(people_all ['ID'])]

输出[364 ]:
ID
2 6
3 7
4 100

所以3和4从结果中删除,布尔掩码看起来像thi s:

 在[366]中:
people_usa ['ID']。isin(people_all ['ID'] )

输出[366]:
0 True
1 True
2 False
3 False
4 False
名称: ID,dtype:bool

使用面具


Let's say that I have two tables: people_all and people_usa, both with the same structure and therefore the same primary key.

How can I get a table of the people not in the USA? In SQL I'd do something like:

select a.*
from people_all a

left outer join people_usa u
on a.id = u.id

where u.id is null

What would be the Python equivalent? I cannot think of a way to translate this where statement into pandas syntax.

The only way I can think of is to add an arbitrary field to people_usa (e.g. people_usa['dummy']=1), do a left join, then take only the records where 'dummy' is nan, then delete the dummy field - which seems a bit convoluted.

Thanks!

解决方案

use isin and negate the boolean mask:

people_usa[~people_usa['ID'].isin(people_all ['ID'])]

Example:

In [364]:
people_all = pd.DataFrame({ 'ID' : np.arange(5)})
people_usa = pd.DataFrame({ 'ID' : [3,4,6,7,100]})
people_usa[~people_usa['ID'].isin(people_all['ID'])]

Out[364]:
    ID
2    6
3    7
4  100

so 3 and 4 are removed from the result, the boolean mask looks like this:

In [366]:
people_usa['ID'].isin(people_all['ID'])

Out[366]:
0     True
1     True
2    False
3    False
4    False
Name: ID, dtype: bool

using ~ inverts the mask

这篇关于python大 pandas :如何在一个数据帧中找到行,但在另一个数据帧中找不到行?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆