Compare (assert equality of) two complex data structures containing numpy arrays in unittest

Problem description

I use Python's unittest module and want to check if two complex data structures are equal. The objects can be lists of dicts with all sorts of values: numbers, strings, Python containers (lists/tuples/dicts) and numpy arrays. The latter are the reason for asking the question, because I cannot just do

self.assertEqual(big_struct1, big_struct2)

because it produces a

ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
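
A minimal sketch of why the plain comparison fails, assuming two small structures that stand in for big_struct1 and big_struct2 (the helper name plain_equality_raises is made up for illustration). The container comparison asks each pair of arrays for a single truth value, which a multi-element array refuses to give:

```python
import numpy as np

def plain_equality_raises(left, right):
    """Hypothetical helper: report whether `left == right` raises ValueError."""
    try:
        left == right
    except ValueError:
        return True
    return False

# Stand-ins for the real structures; contents are illustrative only.
big_struct1 = [{"values": np.array([1, 2, 3])}]
big_struct2 = [{"values": np.array([1, 2, 3])}]

print(plain_equality_raises(big_struct1, big_struct2))
```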

I imagine that I need to write my own equality test for this. It should work for arbitrary structures. My current idea is a recursive function that:

  • tries direct comparison of the current "node" of arg1 to the corresponding node of arg2;
  • if no exception is raised, moves on ("terminal" nodes/leaves are processed here, too);
  • if ValueError is caught, goes deeper until it finds a numpy.array;
  • compares the arrays (e.g. like this).
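
The steps above can be sketched roughly as follows. This is an illustration rather than the final implementation: the name deep_equal is hypothetical, and the sketch descends into containers first instead of waiting for a ValueError, because single-element arrays of different shapes can pass a direct comparison through broadcasting. It also assumes exact array equality via np.array_equal; np.allclose could be swapped in for floating-point data.

```python
import numpy as np

def deep_equal(a, b):
    """Hypothetical helper: recursive equality that tolerates numpy arrays."""
    # Arrays are leaves that need an explicit, single-bool comparison.
    if isinstance(a, np.ndarray) or isinstance(b, np.ndarray):
        return (isinstance(a, np.ndarray) and isinstance(b, np.ndarray)
                and a.shape == b.shape and bool(np.array_equal(a, b)))
    # Dicts: same keys, then compare the corresponding values.
    if isinstance(a, dict) and isinstance(b, dict):
        return a.keys() == b.keys() and all(deep_equal(a[k], b[k]) for k in a)
    # Lists and tuples: same type and length, then pair items up with zip.
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        return (type(a) is type(b) and len(a) == len(b)
                and all(deep_equal(x, y) for x, y in zip(a, b)))
    # Everything else (numbers, strings, ...) is compared directly.
    return bool(a == b)
```

In a test this would be used as self.assertTrue(deep_equal(big_struct1, big_struct2)), at the cost of a less informative failure message than assertEqual gives.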

What seems a little problematic is keeping track of "corresponding" nodes of two structures, but perhaps zip is all I need here.

The question is: are there good (simpler) alternatives to this approach? Maybe numpy presents some tools for this? If no alternatives are suggested, I will implement this idea (unless I have a better one) and post as an answer.
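
On the question of numpy tools: one candidate is numpy.testing.assert_equal, which is documented to accept scalars, lists, tuples, dictionaries and numpy arrays and to recurse into the containers. The sketch below uses made-up data and assumes exact equality is wanted; whether it fits a given suite (failure messages, tolerance for floats) would need checking.

```python
import numpy as np

# Illustrative data only; the names are not from the original question.
expected = [{"name": "a", "values": np.array([1, 2, 3]), "pair": (1, [np.array([0.5])])}]
actual = [{"name": "a", "values": np.array([1, 2, 3]), "pair": (1, [np.array([0.5])])}]

# Passes silently when the nested structures match.
np.testing.assert_equal(actual, expected)

# A mismatch inside a nested array surfaces as an AssertionError.
different = [{"name": "a", "values": np.array([1, 2, 4]), "pair": (1, [np.array([0.5])])}]
try:
    np.testing.assert_equal(different, expected)
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True
```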

P.S. I have a vague feeling that I might have seen a question addressing this problem, but I can't find it now.

P.P.S. An alternative approach would be a function that traverses the structure and converts all numpy.arrays to lists, but is this any easier to implement? Seems the same to me.
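
For comparison, the conversion approach might look like the sketch below. The function name arrays_to_lists is hypothetical, and only dicts, lists and tuples are assumed as containers (tuple subclasses such as namedtuples would need extra handling). After conversion the ordinary assertEqual works, but the comparison is exact and dtype information is lost.

```python
import numpy as np

def arrays_to_lists(obj):
    """Hypothetical helper: return a copy of obj with every ndarray turned into a list."""
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    if isinstance(obj, dict):
        return {key: arrays_to_lists(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        # Rebuild the same container type around the converted items.
        return type(obj)(arrays_to_lists(item) for item in obj)
    return obj
```

In a test: self.assertEqual(arrays_to_lists(big_struct1), arrays_to_lists(big_struct2)).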

Subclassing numpy.ndarray sounds very promising, but obviously I don't have both sides of the comparison hard-coded into a test. One of them, though, is indeed hard-coded, so I can:

  • populate it with custom subclasses of numpy.array;
  • change isinstance(other, SaneEqualityArray) to isinstance(other, np.ndarray) in jterrace's answer;
  • always use it as LHS in comparisons.

My questions in this regard are:

  1. Will it work (I mean, it sounds all right to me, but maybe some tricky edge cases will not be handled correctly)? Will my custom object always end up as LHS in the recursive equality checks, as I expect?
  2. Again, are there better ways (given that I get at least one of the structures with real numpy arrays)?

Edit 2: I tried it out; the (seemingly) working implementation is shown in this answer.

Recommended answer

So the idea illustrated by jterrace seems to work for me with a slight modification:

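import numpy as np

class SaneEqualityArray(np.ndarray):
    def __eq__(self, other):
        # Return one bool, not an element-wise array. Plain ndarray views keep
        # np.allclose's own comparisons from calling back into this method.
        return (isinstance(other, np.ndarray) and self.shape == other.shape and
                bool(np.allclose(np.asarray(self), np.asarray(other))))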

Like I said, the container with these objects should be on the left side of the equality check. I create SaneEqualityArray objects from existing numpy.ndarrays like this:

SaneEqualityArray(my_array.shape, my_array.dtype, my_array)

as per the ndarray constructor signature:

ndarray(shape, dtype=float, buffer=None, offset=0,
        strides=None, order=None)

This class is defined within the test suite and serves for testing purposes only. The RHS of the equality check is an actual object returned by the tested function and contains real numpy.ndarray objects.
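
Put together, a test using this class might look like the sketch below. The helper sane, the test class name and the data are hypothetical, and actual stands in for whatever the tested function returns. The class is repeated so the example is self-contained; it hands plain ndarray views to np.allclose as a precaution against the comparison re-entering __eq__.

```python
import unittest
import numpy as np

class SaneEqualityArray(np.ndarray):
    def __eq__(self, other):
        return (isinstance(other, np.ndarray) and self.shape == other.shape and
                bool(np.allclose(np.asarray(self), np.asarray(other))))

def sane(array):
    """Hypothetical helper: wrap an existing (contiguous) array for expected values."""
    return SaneEqualityArray(array.shape, array.dtype, array)

class StructureTest(unittest.TestCase):
    def test_structures_match(self):
        # Hard-coded expected value, built from SaneEqualityArray objects (LHS).
        expected = [{"name": "x", "data": [sane(np.array([1.0, 2.0])), 3]}]
        # Stand-in for the object returned by the tested function (RHS).
        actual = [{"name": "x", "data": [np.array([1.0, 2.0 + 1e-12]), 3]}]
        self.assertEqual(expected, actual)
```

Because the comparison goes through np.allclose, small floating-point differences such as the 1e-12 above do not fail the test.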

P.S. Thanks to the authors of both answers posted so far, they were both very helpful. If anyone sees any problems with this approach, I'd appreciate your feedback.
