Left join in pandas with approximately equal numeric comparison
Problem description
I am using the following to do a left join in Pandas:
merged_left = pd.merge(left=xrf_df,
                       right=statistics_and_notes_df,
                       how='left',
                       left_on=depth_column_name,
                       right_on='Core Depth')
However, the depth_column_name and 'Core Depth' columns are floating point numbers. Is there a good way to do this left join so that the keys are compared for approximate equality, as with np.isclose()?
Recommended answer
Assume we have the following DataFrames:
In [111]: a
Out[111]:
      a  b  c
0  3.03  c  3
1  1.01  a  1
2  2.02  b  2

In [112]: b
Out[112]:
      a  x
0  1.02  Z
1  5.00  Y
2  3.04  X
Let's set the float64 join column as the (sorted) index of each DataFrame:
In [113]: a = a.sort_values('a').set_index('a')
In [114]: b = b.assign(idx=b['a']).set_index('idx').sort_index()
In [115]: a
Out[115]:
      b  c
a
1.01  a  1
2.02  b  2
3.03  c  3

In [116]: b
Out[116]:
         a  x
idx
1.02  1.02  Z
3.04  3.04  X
5.00  5.00  Y
Now we can use DataFrame.reindex(..., method='nearest'):
In [118]: a.join(b.reindex(a.index, method='nearest'), how='left')
Out[118]:
      b  c     a  x
a
1.01  a  1  1.02  Z
2.02  b  2  1.02  Z
3.03  c  3  3.04  X

In [119]: a.join(b.reindex(a.index, method='nearest'), how='left').rename(columns={'a':'a_right'})
Out[119]:
      b  c  a_right  x
a
1.01  a  1     1.02  Z
2.02  b  2     1.02  Z
3.03  c  3     3.04  X

In [120]: a.join(b.reindex(a.index, method='nearest'), how='left').rename(columns={'a':'a_right'}).reset_index()
Out[120]:
      a  b  c  a_right  x
0  1.01  a  1     1.02  Z
1  2.02  b  2     1.02  Z
2  3.03  c  3     3.04  X
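The steps above can be sketched as a self-contained script. The DataFrame constructors are reconstructed from the printed output, and the is_close column with atol=0.05 is an assumption added for illustration, not part of the original answer. It shows that a nearest match is not necessarily an approximately equal one: 2.02 is paired with 1.02 only because nothing closer exists.

```python
import numpy as np
import pandas as pd

# Sample frames reconstructed from the printed output above.
a = pd.DataFrame({'a': [3.03, 1.01, 2.02], 'b': ['c', 'a', 'b'], 'c': [3, 1, 2]})
b = pd.DataFrame({'a': [1.02, 5.00, 3.04], 'x': ['Z', 'Y', 'X']})

# Use the float join column as a sorted index on both sides;
# b keeps a copy of its key as a regular column.
a = a.sort_values('a').set_index('a')
b = b.assign(idx=b['a']).set_index('idx').sort_index()

# Nearest-neighbour lookup of b's rows at a's index labels, then a left join.
merged = (a.join(b.reindex(a.index, method='nearest'), how='left')
           .rename(columns={'a': 'a_right'})
           .reset_index())

# Assumed check: flag rows whose matched key is within 0.05 of the left key.
merged['is_close'] = np.isclose(merged['a'], merged['a_right'], atol=0.05)
print(merged)
```

Rows where is_close is False are the ones a tolerance-limited join would leave unmatched.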
P.S. You may want to use the tolerance parameter, df.reindex(..., tolerance=<value>), to limit how far a nearest match may be from the target: abs(index[indexer] - target) <= tolerance. Labels with no match within the tolerance are filled with NaN.
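As a minimal sketch of the tolerance variant, using the same sample data: the value 0.05 is an assumed threshold chosen for illustration and should be replaced by whatever precision your depth measurements justify.

```python
import pandas as pd

# Sample frames reconstructed from the answer's printed output.
a = pd.DataFrame({'a': [3.03, 1.01, 2.02], 'b': ['c', 'a', 'b'], 'c': [3, 1, 2]})
b = pd.DataFrame({'a': [1.02, 5.00, 3.04], 'x': ['Z', 'Y', 'X']})

a = a.sort_values('a').set_index('a')
b = b.assign(idx=b['a']).set_index('idx').sort_index()

# tolerance=0.05 is an assumed value: matches farther away than this
# are dropped, so the corresponding rows get NaN instead.
nearest = b.reindex(a.index, method='nearest', tolerance=0.05)

merged = (a.join(nearest, how='left')
           .rename(columns={'a': 'a_right'})
           .reset_index())
print(merged)
```

With this threshold the row for 2.02 no longer picks up 1.02 and keeps NaN in the right-hand columns, which is the usual left-join behaviour for an unmatched key. pandas also provides pd.merge_asof(..., direction='nearest', tolerance=...), which may be worth considering when you would rather join on columns than on the index.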