How to subtract rows of one pandas data frame from another?
Question
The operation I want to perform is similar to a merge. For example, an inner merge gives a data frame containing the rows that are present in both the first AND the second data frame. An outer merge gives a data frame containing the rows that are present in EITHER the first OR the second data frame.
What I need is a data frame containing the rows that are present in the first data frame AND NOT present in the second one. Is there a fast and elegant way to do this?
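To make the two merge types in the question concrete, here is a minimal sketch. The frames `left` and `right` are hypothetical toy data invented for illustration, not the asker's data; the operation being asked for is the set of rows that appear in `left` but not in `right`.

```python
import pandas as pd

# Hypothetical toy frames, used only to illustrate inner vs. outer merges.
left = pd.DataFrame({"Team": ["Hawks", "Nets", "Heat"],
                     "Year": [2001, 1988, 2004]})
right = pd.DataFrame({"Team": ["Nets", "Heat", "Pacers"],
                      "Year": [1988, 2004, 2003]})

# Inner merge: only the (Team, Year) pairs found in both frames.
inner = left.merge(right, on=["Team", "Year"], how="inner")

# Outer merge: every (Team, Year) pair found in either frame.
outer = left.merge(right, on=["Team", "Year"], how="outer")

print(inner)
print(outer)
```

Neither result is the "subtraction" the question asks about, which here would be the single row (Hawks, 2001).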
Recommended answer
How about something like the following?
print(df1)
     Team  Year  foo
0   Hawks  2001    5
1   Hawks  2004    4
2    Nets  1987    3
3    Nets  1988    6
4    Nets  2001    8
5    Nets  2000   10
6    Heat  2004    6
7  Pacers  2003   12
print(df2)
     Team  Year  foo
0  Pacers  2003   12
1    Heat  2004    6
2    Nets  1988    6
As long as both frames share a non-key column with the same name, you can let the suffixes that merge adds do the work. A left merge leaves NaN in the right-hand copy of that column for every row of df1 that has no match in df2. (If there is no shared non-key column, you can create a temporary one, e.g. df1['common'] = 1 and df2['common'] = 1.)
# Left merge on the key columns; unmatched rows get NaN in foo_y
new = df1.merge(df2, on=['Team', 'Year'], how='left')
print(new[new.foo_y.isnull()])
    Team  Year  foo_x  foo_y
0  Hawks  2001      5    NaN
1  Hawks  2004      4    NaN
2   Nets  1987      3    NaN
4   Nets  2001      8    NaN
5   Nets  2000     10    NaN
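The left-merge approach above depends on a shared non-key column to detect unmatched rows. A variant that avoids that requirement is sketched below, assuming a pandas version that supports the `indicator` argument of `merge`. The frames are rebuilt from the printed tables; the names `keys`, `merged`, and `only_in_df1` are illustrative and not part of the original answer.

```python
import pandas as pd

# Data reconstructed from the tables printed in the answer.
df1 = pd.DataFrame({
    "Team": ["Hawks", "Hawks", "Nets", "Nets", "Nets", "Nets", "Heat", "Pacers"],
    "Year": [2001, 2004, 1987, 1988, 2001, 2000, 2004, 2003],
    "foo": [5, 4, 3, 6, 8, 10, 6, 12],
})
df2 = pd.DataFrame({
    "Team": ["Pacers", "Heat", "Nets"],
    "Year": [2003, 2004, 1988],
    "foo": [12, 6, 6],
})

keys = ["Team", "Year"]

# Merge against the key columns of df2 only (deduplicated so that no df1 row
# is repeated); indicator=True adds a "_merge" column marking each row's origin.
merged = df1.merge(df2[keys].drop_duplicates(), on=keys, how="left", indicator=True)

# Keep the rows that were found only in df1, then drop the helper column.
only_in_df1 = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_in_df1)
```

Because only the key columns of df2 take part in the merge, the result keeps the original column name `foo` instead of `foo_x` and `foo_y`.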
Or you can use isin, but you would have to combine the key columns into a single key first:
# Build one string key per row from the two key columns
df1['key'] = df1['Team'] + df1['Year'].astype(str)
df2['key'] = df2['Team'] + df2['Year'].astype(str)
# Keep the rows of df1 whose key does not appear in df2
print(df1[~df1.key.isin(df2.key)])
    Team  Year  foo        key
0  Hawks  2001    5  Hawks2001
1  Hawks  2004    4  Hawks2004
2   Nets  1987    3   Nets1987
4   Nets  2001    8   Nets2001
5   Nets  2000   10   Nets2000
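A string-concatenated key adds a column to both frames and could, in principle, make two different (Team, Year) pairs collide once joined into one string. As a hedged alternative, the sketch below compares the key columns as tuples through a MultiIndex. The frames are rebuilt from the printed tables; `keys`, `mask`, and `result` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Data reconstructed from the tables printed in the answer.
df1 = pd.DataFrame({
    "Team": ["Hawks", "Hawks", "Nets", "Nets", "Nets", "Nets", "Heat", "Pacers"],
    "Year": [2001, 2004, 1987, 1988, 2001, 2000, 2004, 2003],
    "foo": [5, 4, 3, 6, 8, 10, 6, 12],
})
df2 = pd.DataFrame({
    "Team": ["Pacers", "Heat", "Nets"],
    "Year": [2003, 2004, 1988],
    "foo": [12, 6, 6],
})

keys = ["Team", "Year"]

# set_index(keys).index is a MultiIndex of (Team, Year) tuples; isin tests
# each tuple of df1 for membership among the tuples of df2.
mask = df1.set_index(keys).index.isin(df2.set_index(keys).index)

# Negate the mask to keep the rows of df1 that are absent from df2.
result = df1[~mask]
print(result)
```

This leaves df1 and df2 unmodified, since no helper column is written to either frame.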