pandas 左联接中不匹配的左表记录 [英] unmatched left table records in a left join in pandas
问题描述
我有两个数据框,学生数据框和费用数据框。 费用 DataFrame中缺少某些学生的费用明细。我想返回所有缺少费用详细信息的学生的详细信息。 班级,部分和 RollNo这三个字段构成唯一的组合。
I have two DataFrames, 'Students' DataFrame and 'Fee' DataFrame. The fee details of some of the students are missing in 'Fee' DataFrame. I would like to return the details of all students whose fee details are missing. The three fields 'Class', 'Section' and 'RollNo' form a unique combination.
Students = pd.DataFrame({
'Class': [7, 7, 8],
'Section': ['A', 'B', 'B'],
'RollNo': [2, 3, 4],
'Student': ['Ram', 'Rahim', 'Robert']
})
Fee = pd.DataFrame({
'Class': [7, 7, 8],
'Section': ['A', 'B', 'B'],
'RollNo': [2, 2, 3],
'Fee': [10, 20, 30]
})
Students
Class RollNo Section Student
0 7 2 A Ram
1 7 3 B Rahim
2 8 4 B Robert
Fee
Class Fee RollNo Section
0 7 10 2 A
1 7 20 2 B
2 8 30 3 B
基本上,我想在左侧表格中找到不匹配的记录根据上述3个字段在学生和费用数据框架之间进行左联接。在Python中使用 Pandas 来实现此目标的最简单方法是什么?
Essentially, I would like to find the unmatched records from the left table when I do a left join between 'Students' and 'Fee' DataFrames based on 3 fields mentioned above. What is the simplest way to achieve this using Pandas in Python?
非常感谢!
推荐答案
如果 Fee
列中的 NaN
没有, code>费用 DataFrame使用 合并
由 布尔索引
与 isna
:
If no NaN
s in Fee
column in Fee
DataFrame use merge
anf filter by boolean indexing
with isna
:
df = pd.merge(Students, Fee, how='left')
print (df)
Class RollNo Section Student Fee
0 7 2 A Ram 10.0
1 7 3 B Rahim NaN
2 8 4 B Robert NaN
df1 = df[df['Fee'].isna()].drop('Fee', axis=1)
#for oldier versions of pandas
#df1 = df[df['Fee'].isnull()].drop('Fee', axis=1)
print (df1)
Class RollNo Section Student
1 7 3 B Rahim
2 8 4 B Robert
使用 NaN
的更通用的解决方案,也将参数 indicator
添加到 merge
并使用 left_only
过滤行:
More general solution working with NaN
s too add parameter indicator
to merge
and filter rows with left_only
:
Fee = pd.DataFrame({'Class':[7,7,8],
'Section':['A','B','B'],
'RollNo':[2,2,3],
'Fee':[np.nan,20,30]})
print (Fee)
Class Fee RollNo Section
0 7 NaN 2 A
1 7 20.0 2 B
2 8 30.0 3 B
df = pd.merge(Students, Fee, how='left', indicator=True)
print (df)
Class RollNo Section Student Fee _merge
0 7 2 A Ram NaN both
1 7 3 B Rahim NaN left_only
2 8 4 B Robert NaN left_only
df1 = df[df['_merge'].eq('left_only')].drop(['Fee','_merge'], axis=1)
print (df1)
Class RollNo Section Student
1 7 3 B Rahim
2 8 4 B Robert
这篇关于 pandas 左联接中不匹配的左表记录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!