How to use pandas to print the difference of two columns?
Problem description
I have two datasets.
The first has a column with a list of email addresses:
DF1
Email
xxxx@abc.gov
xxxx@abc.gov
xxxx@abc.gov
xxxx@abc.gov
xxxx@abc.gov
The second CSV, Dataframe2:
Email
xxxx@abc.gov
xxxx@abc.gov
xxxx@abc.gov
xxxx@abc.gov
dddd@abc.com
dddd@abc.com
3333@abc.com
import pandas as pd
SansList = r'C:\\Sans compare\\SansList.csv'
AllUsers = r'C:\\Sans compare\\AllUser.csv'
## print Name column only and turn into data sets from CSV ##
df1 = pd.read_csv(SansList, usecols=[0])
df2 = pd.read_csv(AllUsers, usecols=[2])
print(df1['Email'].isin(df2)==False)
I would like the result to be:
Dataframe3
dddd@abc.com
dddd@abc.com
3333@abc.com
Not quite sure how to fix my dataset... :(
Recommended answer
Option 1: isin
df2[~df2.Email.isin(df1.Email)]
Email
4 dddd@abc.com
5 dddd@abc.com
6 3333@abc.com
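To make Option 1 reproducible without the CSV files, here is a minimal sketch that rebuilds the two frames from the listings in the question (the in-memory frames and the names mask and df3 are assumptions for illustration, not part of the original answer). Note that the check runs from df2 against df1's Email column, whereas the attempt in the question tested df1 against the whole df2 frame.

```python
import pandas as pd

# Sample frames rebuilt from the listings in the question
# (assumption: the real data would come from the two CSV files).
df1 = pd.DataFrame({"Email": ["xxxx@abc.gov"] * 5})
df2 = pd.DataFrame(
    {"Email": ["xxxx@abc.gov"] * 4
              + ["dddd@abc.com", "dddd@abc.com", "3333@abc.com"]}
)

# True where an address in df2 also appears in df1's Email column.
mask = df2["Email"].isin(df1["Email"])

# ~ inverts the mask, keeping only the rows of df2 that are missing from df1.
df3 = df2[~mask]
print(df3)
```

The original row labels of df2 (4, 5, 6) are preserved; calling reset_index(drop=True) on the result would renumber them from 0 if that is preferred.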
Option 2: query
df2.query('Email not in @df1.Email')
Email
4 dddd@abc.com
5 dddd@abc.com
6 3333@abc.com
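A self-contained sketch of the query approach, assuming the same sample data as in the question. As a small variation on the answer's expression, the addresses from df1 are first copied into a plain list (the name known_emails is hypothetical) and referenced with the @ prefix, which tells query to look the name up in the surrounding Python scope.

```python
import pandas as pd

# Sample frames rebuilt from the listings in the question (assumption).
df1 = pd.DataFrame({"Email": ["xxxx@abc.gov"] * 5})
df2 = pd.DataFrame(
    {"Email": ["xxxx@abc.gov"] * 4
              + ["dddd@abc.com", "dddd@abc.com", "3333@abc.com"]}
)

# Hypothetical helper name: the addresses already present in df1.
known_emails = df1["Email"].tolist()

# "@known_emails" refers to the local Python variable, not a column.
result = df2.query("Email not in @known_emails")
print(result)
```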
Option 3: merge
pd.DataFrame.merge with indicator=True lets you see which dataframe each row came from. We can then filter on it.
df2.merge(
    df1, how='outer', indicator=True
).query('_merge == "left_only"').drop(columns='_merge')
Email
20 dddd@abc.com
21 dddd@abc.com
22 3333@abc.com
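The row labels 20 to 22 above arise because every matching address in df2 is paired with every matching address in df1 (4 x 5 = 20 rows) before the unmatched rows appear. The sketch below is a variant of the merge approach, not the answer's exact call: it assumes a left join against a de-duplicated df1 is acceptable, which avoids that multiplication. The sample frames and the names merged and only_in_df2 are illustrative.

```python
import pandas as pd

# Sample frames rebuilt from the listings in the question (assumption).
df1 = pd.DataFrame({"Email": ["xxxx@abc.gov"] * 5})
df2 = pd.DataFrame(
    {"Email": ["xxxx@abc.gov"] * 4
              + ["dddd@abc.com", "dddd@abc.com", "3333@abc.com"]}
)

# De-duplicate df1 so each row of df2 matches at most one row,
# then keep every row of df2 (left join) and record where it was found.
merged = df2.merge(df1.drop_duplicates(), on="Email", how="left", indicator=True)

# "_merge" is "left_only" for rows of df2 with no counterpart in df1.
only_in_df2 = merged.loc[merged["_merge"] == "left_only", ["Email"]]
print(only_in_df2)
```

With this variant the merged frame has the same number of rows as df2, so the filtered rows keep positions that correspond to df2.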