Working with floating point NumPy arrays for comparison and related operations


Question

I have an array of random floats and I need to compare it to another array that holds the same values in a different order. To do that, I use the sum, the product, and other combinations, depending on the dimension of the table and hence the number of equations needed.

However, I run into a precision issue: the sum (or product) of an array changes slightly depending on the order of its values.

Here is a simple standalone example to illustrate the issue:

import numpy as np

n = 10
m = 4

tag = np.random.rand(n, m)

s1 = np.sum(tag, axis=1)
s2 = np.sum(tag[:, ::-1], axis=1)

# print the number of rows where s1 is not equal to s2 (should be 0)
print(np.nonzero(s1 != s2)[0].shape[0])

If you execute this code, it sometimes reports that s1 and s2 are not equal, and the difference is on the order of machine precision.

The problem is that I need to use these values in functions like np.in1d, where I can't really give a tolerance.

Is there a way to avoid this issue?

Recommended answer

For the listed code, you can use np.isclose, which also lets you specify tolerance values.

Using the provided sample, let's see how it could be used:

In [201]: n = 10
     ...: m = 4
     ...: 
     ...: tag = np.random.rand(n, m)
     ...: 
     ...: s1 = np.sum(tag, axis=1)
     ...: s2 = np.sum(tag[:, ::-1], axis=1)
     ...: 

In [202]: np.nonzero(s1 != s2)[0].shape[0]
Out[202]: 4

In [203]: (~np.isclose(s1,s2)).sum() # So, all matches!
Out[203]: 0
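
As a side note on why the sums differ at all: floating-point addition is not associative, so adding the same values in a different order can change the last bit of the result. The sketch below shows that on three scalars and then passes explicit tolerances to np.isclose, which treats two values as close when |a - b| <= atol + rtol * |b|. The seed and the atol value of 1e-12 are arbitrary choices for illustration, not values from the question.

```python
import numpy as np

# Floating-point addition is not associative, so summation order matters.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
order_matters = a != b  # the two results differ in the last bit

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
tag = rng.random((10, 4))
s1 = tag.sum(axis=1)
s2 = tag[:, ::-1].sum(axis=1)

# Default tolerances (rtol=1e-05, atol=1e-08).
close_default = np.isclose(s1, s2)

# Purely absolute tolerance: with rtol=0 the test reduces to |s1 - s2| <= atol.
close_tight = np.isclose(s1, s2, rtol=0.0, atol=1e-12)
manual = np.abs(s1 - s2) <= 1e-12
```

Setting rtol to zero is only one option. For values spread over many orders of magnitude, a relative tolerance is usually the better fit.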

To use tolerance values in other scenarios, we need to work case by case. For an implementation that involves elementwise comparison, like np.in1d, we can bring in broadcasting to compare every element of the first input against every element of the second. We take np.abs of the differences to get a "closeness factor" and compare it against the input tolerance to decide the matches. To simulate np.in1d, we then do an ANY operation along the second axis. Thus, np.in1d with a tolerance could be implemented with broadcasting like so:

def in1d_with_tolerance(A,B,tol=1e-05):
    return (np.abs(A[:,None] - B) < tol).any(1)

As suggested by the OP in the comments, we can also scale the floating-point numbers up and round them. This should be more memory efficient, which matters when working with large arrays. A modified version would look like this:

def in1d_with_tolerance_v2(A,B,tol=1e-05):
    S = round(1/tol)
    return np.in1d(np.around(A*S).astype(int),np.around(B*S).astype(int))

Sample run:

In [372]: A = np.random.rand(5)
     ...: B = np.random.rand(7)
     ...: B[3] = A[1] + 0.0000008
     ...: B[6] = A[4] - 0.0000007
     ...: 

In [373]: np.in1d(A,B) # Not the result we want!
Out[373]: array([False, False, False, False, False], dtype=bool)

In [374]: in1d_with_tolerance(A,B)
Out[374]: array([False,  True, False, False,  True], dtype=bool)

In [375]: in1d_with_tolerance_v2(A,B)
Out[375]: array([False,  True, False, False,  True], dtype=bool)
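
One caveat with the rounding version: two values that are within tol of each other can still land in different rounding bins (for example 0.1000049 and 0.1000051 with tol=1e-05), so it can miss matches near a bin boundary. The broadcasting version avoids that but builds a len(A) x len(B) intermediate. A possible middle ground, sketched here as an assumption rather than as part of the original answer, is to sort B and check only the two neighbours of each element of A with np.searchsorted. The function name in1d_with_tolerance_sorted is made up for this example.

```python
import numpy as np

def in1d_with_tolerance_sorted(A, B, tol=1e-05):
    """Tolerance-based membership test using a sorted copy of B.

    Hypothetical helper: memory use is O(len(A) + len(B)) instead of the
    O(len(A) * len(B)) intermediate created by the broadcasting version.
    """
    A = np.asarray(A, dtype=float)
    B_sorted = np.sort(np.asarray(B, dtype=float))
    if B_sorted.size == 0:
        return np.zeros(A.shape, dtype=bool)

    # Insertion index of each A value; its nearest neighbours in B_sorted
    # sit at idx - 1 and idx (clipped to stay inside the array).
    idx = np.searchsorted(B_sorted, A)
    left = B_sorted[np.clip(idx - 1, 0, B_sorted.size - 1)]
    right = B_sorted[np.clip(idx, 0, B_sorted.size - 1)]

    return (np.abs(A - left) < tol) | (np.abs(A - right) < tol)

# The first pair straddles a rounding boundary but is only 2e-7 apart.
A = np.array([0.1000049, 0.5, 0.9])
B = np.array([0.9000001, 0.1000051, 0.3])
mask = in1d_with_tolerance_sorted(A, B)
```

The sort costs O(len(B) log len(B)), which tends to pay off once both arrays are large.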

Finally, how to make this work for other implementations and use cases depends on the implementation itself. For most cases, though, np.isclose and broadcasting should help.
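
As a rough illustration of combining the two, the sketch below swaps the manual np.abs comparison for np.isclose inside the broadcasting pattern, which makes both a relative and an absolute tolerance available. The name in1d_isclose and the sample values are assumptions made for this example.

```python
import numpy as np

def in1d_isclose(A, B, rtol=1e-05, atol=1e-08):
    """For each element of A, report whether any element of B is close to it.

    Hypothetical helper: builds a len(A) x len(B) boolean matrix, so it is
    best suited to small or medium-sized inputs.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # A[:, None] has shape (len(A), 1) and broadcasts against B's (1, len(B)).
    return np.isclose(A[:, None], B[None, :], rtol=rtol, atol=atol).any(axis=1)

A = np.array([0.25, 0.5, 0.75])
B = np.array([0.75 + 1e-10, 0.1, 0.25 - 1e-10])
mask = in1d_isclose(A, B)
```

Note that np.isclose is not symmetric in its arguments, because the relative term is scaled by the second one. Here that means the tolerance scales with the values in B.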
