Generating list of lists with custom value limitations with Hypothesis

Question

The story:

Currently, I have a function-under-test that expects a list of lists of integers with the following rules:

  1. number of sublists (let's call it N) can be from 1 to 50
  2. number of values inside sublists is the same for all sublists (rectangular form) and should be >= 0 and <= 5
  3. values inside sublists cannot be more than or equal to the total number of sublists. In other words, each value inside a sublist is an integer >= 0 and < N

Sample valid inputs:

[[0]]
[[2, 1], [2, 0], [3, 1], [1, 0]]
[[1], [0]]

Sample invalid inputs:

[[2]]  # 2 is more than N=1 (total number of sublists)
[[0, 1], [2, 0]]  # 2 is equal to N=2 (total number of sublists)
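
The three rules can also be written down as a plain-Python check, which is handy as an oracle when trying out strategies. This is only a sketch: the name is_valid_input and its max_rows / max_cols parameters are hypothetical and are not part of the function under test.

```python
def is_valid_input(ls, max_rows=50, max_cols=5):
    """Return True if ls obeys rules 1-3 (hypothetical helper)."""
    n = len(ls)
    # Rule 1: between 1 and max_rows sublists.
    if not 1 <= n <= max_rows:
        return False
    # Rule 2: every sublist has the same length, at most max_cols.
    width = len(ls[0])
    if width > max_cols:
        return False
    # Rule 3: every value is an index into the outer list, 0 <= v < n.
    return all(
        len(row) == width and all(0 <= v < n for v in row)
        for row in ls
    )


print(is_valid_input([[2, 1], [2, 0], [3, 1], [1, 0]]))  # True
print(is_valid_input([[0, 1], [2, 0]]))                  # False
```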

I'm trying to approach it with property-based testing and generate different valid inputs with the hypothesis library. I've been trying to wrap my head around lists() and integers(), but cannot make it work:

  • condition #1 is easy to handle with lists() and its min_size and max_size arguments
  • condition #2 is covered under "Chaining strategies together" in the Hypothesis documentation
  • condition #3 is what I'm struggling with: if we use the rectangle_lists from the above example, we don't have a reference to the length of the "parent" list inside integers()

The question:

How can I limit the integer values inside sublists to be less than the total number of sublists?

Some of my attempts:

from hypothesis import given
from hypothesis.strategies import lists, integers

@given(lists(lists(integers(min_value=0, max_value=5), min_size=1, max_size=5), min_size=1, max_size=50))
def test(l):
    ...  # test body omitted

This one was very far from meeting the requirements: the list is not strictly rectangular, and the generated integer values can exceed the generated size of the list.

from hypothesis import given
from hypothesis.strategies import lists, integers

@given(integers(min_value=0, max_value=5).flatmap(lambda n: lists(lists(integers(min_value=1, max_value=5), min_size=n, max_size=n), min_size=1, max_size=50)))
def test(l):
    ...  # test body omitted

Here, requirements #1 and #2 are met, but the integer values can be larger than the size of the list, so requirement #3 is not met.

Recommended answer

There's a good general technique that is often useful when trying to solve tricky constraints like this: try to build something that looks a bit like what you want but doesn't satisfy all the constraints and then compose it with a function that modifies it (e.g. by throwing away the bad bits or patching up bits that don't quite work) to make it satisfy the constraints.

For your case, you could do something like the following:

from hypothesis.strategies import builds, lists, integers

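def prune_list(ls):
    # Valid values are indices into the outer list: 0 <= i < n.
    n = len(ls)
    return [
        # Drop out-of-range values, then keep at most 5 per sublist.
        [i for i in sublist if i < n][:5]
        for sublist in ls
    ]

# Note: later Hypothesis releases no longer accept average_size; drop it there.
limited_list_strategy = builds(
    prune_list,
    lists(lists(integers(0, 49), average_size=5), max_size=50, min_size=1)
)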

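Here we:

  1. Generate a list that looks roughly right (it's a list of lists of integers, and the integers are in the same range as all the indices that could possibly be valid).
  2. Prune out any invalid indices from the sublists.
  3. Truncate any sublists that still have more than 5 elements in them.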

The result should satisfy all three conditions you needed.

The average_size parameter isn't strictly necessary but in experimenting with this I found it was a bit too prone to producing empty sublists otherwise.
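
To make the pruning and truncation steps concrete, here is prune_list applied to one hand-picked raw value. The input is a hypothetical stand-in for what the inner lists(...) strategy might draw; no Hypothesis import is needed to follow it.

```python
def prune_list(ls):
    # Same logic as in the answer: keep values below len(ls), at most 5 each.
    n = len(ls)
    return [[i for i in sublist if i < n][:5] for sublist in ls]


# Three sublists, so only the values 0, 1 and 2 are valid.
raw = [[0, 7, 1, 3], [2, 2, 2, 2, 2, 2, 2], [1]]
pruned = prune_list(raw)
print(pruned)  # [[0, 1], [2, 2, 2, 2, 2], [1]]
```

Every surviving value is below 3 and no sublist is longer than 5, although the sublists can still differ in length.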

ETA: Apologies. I've just realised that I misread one of your conditions - this doesn't actually do quite what you want because it doesn't ensure each list is the same length. Here's a way to modify this to fix that (it gets a bit more complicated, so I've switched to using composite instead of builds):

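import itertools

from hypothesis.strategies import composite, lists, integers, permutations


@composite
def limited_lists(draw):
    # Note: later Hypothesis releases no longer accept average_size; drop it there.
    ls = draw(
        lists(lists(integers(0, 49), average_size=5), max_size=50, min_size=1)
    )
    filler = draw(permutations(range(50)))
    sublist_length = draw(integers(0, 5))

    n = len(ls)
    # Drop out-of-range values and cut every sublist to the common length.
    pruned = [
        [i for i in sublist if i < n][:sublist_length]
        for sublist in ls
    ]

    # Pad short sublists with valid filler values. The filler holds only n
    # valid values, so cycle through them in case n < sublist_length.
    valid_filler = [i for i in filler if i < n]
    for sublist in pruned:
        for i in itertools.cycle(valid_filler):
            if len(sublist) == sublist_length:
                break
            sublist.append(i)
    return pruned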

The idea is that we generate a "filler" list that provides the defaults for what a sublist looks like (so they will tend to shrink in the direction of being more similar to each other), and then draw the length that every sublist is pruned or padded to, which gives that consistency.
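
The prune-then-fill step can be traced by hand with fixed values in place of the three draws. The function below is a sketch with a hypothetical name (prune_and_fill). It also assumes the valid filler values are cycled, so that every sublist reaches the common length even when there are fewer sublists than sublist_length.

```python
import itertools


def prune_and_fill(ls, filler, sublist_length):
    """Deterministic version of the strategy body (hypothetical helper)."""
    n = len(ls)
    # Drop out-of-range values and cut each sublist to the common length.
    pruned = [[i for i in sub if i < n][:sublist_length] for sub in ls]
    # Only filler values below n are usable; 0 is always among them when
    # the filler is a permutation of range(50).
    valid_filler = [i for i in filler if i < n]
    for sub in pruned:
        for i in itertools.cycle(valid_filler):
            if len(sub) == sublist_length:
                break
            sub.append(i)
    return pruned


# A short list stands in for the drawn permutation of range(50).
result = prune_and_fill([[0, 9, 1], [5], [1, 1, 1, 1]], [2, 7, 0, 1], 2)
print(result)  # [[0, 1], [2, 0], [1, 1]]
```

The middle sublist lost its only (invalid) value and was refilled from the filler, while the other two were merely truncated.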

This has got pretty complicated, I admit. You might want to use RecursivelyIronic's flatmap-based version. The main reason I prefer this one is that it will tend to shrink better, so you'll get nicer examples out of it.
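
The flatmap-based answer is not reproduced on this page. For comparison, a minimal sketch of what such a version could look like is given below. It is an assumption about the general approach, not RecursivelyIronic's actual code, and rectangular_index_lists with its parameters is a hypothetical name. It draws the number of sublists and their common length first, then builds the lists from those two numbers.

```python
from hypothesis import given, settings
from hypothesis.strategies import integers, lists, tuples


def rectangular_index_lists(max_rows=50, max_cols=5):
    """Hypothetical helper: draw (n, width), then n rows of exactly width values."""
    shape = tuples(integers(1, max_rows), integers(0, max_cols))
    return shape.flatmap(
        lambda nw: lists(
            # Each value is an index into the outer list: 0 <= v <= n - 1.
            lists(integers(0, nw[0] - 1), min_size=nw[1], max_size=nw[1]),
            min_size=nw[0],
            max_size=nw[0],
        )
    )


@settings(max_examples=100, deadline=None)
@given(rectangular_index_lists())
def test_strategy_respects_rules(ls):
    n = len(ls)
    assert 1 <= n <= 50                              # rule 1
    assert len({len(row) for row in ls}) == 1        # rule 2: rectangular
    assert all(len(row) <= 5 for row in ls)          # rule 2: at most 5 values
    assert all(0 <= v < n for row in ls for v in row)  # rule 3
```

Because every constraint is built into the generation, nothing has to be filtered or patched afterwards. The trade-off named in the answer is shrinking quality.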
