How does the numpy function `array_split` work mathematically?


Question

I need to write a Python function that, when passed an array and an integer N, returns the contents of the array divided into N sub-arrays of equal size.

If the length of the array cannot be divided equally by N, the final sub-arrays must be of suitable length to accommodate the remaining elements.

Example: split_array(array=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], n=4)

Should output: [[1, 2, 3], [4, 5, 6], [7, 8], [9, 10]]

My research indicated that the numpy.array_split function does exactly that. I looked at the source code on GitHub and found that it first builds an array containing the sizes of all the sub-arrays, which it then iterates over to split the original array.

Abridged sample from numpy.array_split:

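def array_split(ary, indices_or_sections, axis=0):
    # Abridged excerpt: _nx is NumPy's internal alias for its core numeric module.
    # Ntotal is the length along the split axis (this line was left out of the excerpt).
    Ntotal = ary.shape[axis]
    # indices_or_sections is a scalar, not an array.
    Nsections = int(indices_or_sections)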
    if Nsections <= 0:
        raise ValueError('number sections must be larger than 0.')
    Neach_section, extras = divmod(Ntotal, Nsections)
    section_sizes = ([0] +
                     extras * [Neach_section+1] +
                     (Nsections-extras) * [Neach_section])
    div_points = _nx.array(section_sizes, dtype=_nx.intp).cumsum()

    sub_arys = []
    sary = _nx.swapaxes(ary, axis, 0)
    for i in range(Nsections):
        st = div_points[i]
        end = div_points[i + 1]
        sub_arys.append(_nx.swapaxes(sary[st:end], axis, 0))

    return sub_arys

The only thing I'm struggling to understand is how the variable section_sizes is created mathematically. For the example split_array(array=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], n=4), it builds a list of sizes, [3, 3, 2, 2], which is exactly what I need, but I don't understand why it works.

I understand that divmod(Ntotal, Nsections) gives you the quotient (Neach_section) and the remainder (extras) of a division.

But why does quotient * [remainder+1] always give you exactly the right number of correctly sized "quotient" sub-array sizes (in this example, [3, 3])?

Why does [quotient-remainder] * quotient give you exactly the right number of correctly sized "remainder" sub-array sizes (in this example, [2, 2])?

Could someone even just tell me what this kind of operation is called, or what branch of mathematics it belongs to? It's not something I've come across before.

Recommended answer

For clarity, I'll refer to this:

Neach_section, extras = divmod(Ntotal, Nsections)
section_sizes = ([0] +
                 extras * [Neach_section+1] +
                 (Nsections-extras) * [Neach_section])

# The same code with the variables renamed to quotient and remainder:
quotient, remainder = divmod(Ntotal, Nsections)
section_sizes = ([0] +
                 remainder * [quotient+1] +
                 (Nsections - remainder) * [quotient])
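As a quick check, the renamed snippet can be run on its own. This is a minimal sketch in plain Python; `section_sizes_for` is a hypothetical helper name used only for illustration, not part of NumPy, and it leaves out the leading `[0]` that NumPy keeps for the cumulative sum.

```python
def section_sizes_for(n_total, n_sections):
    """Return the sub-array sizes built by the snippet above (hypothetical helper)."""
    quotient, remainder = divmod(n_total, n_sections)
    # The first `remainder` sections get one extra element each;
    # the other `n_sections - remainder` sections get exactly `quotient`.
    return remainder * [quotient + 1] + (n_sections - remainder) * [quotient]


print(section_sizes_for(10, 4))  # [3, 3, 2, 2]
print(section_sizes_for(14, 4))  # [4, 4, 3, 3]
```

In both cases the sizes add up to the total length, which is the property the rest of this answer explains.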

First, let's imagine a case similar to the one in your question, modified so that quotient != remainder:

import numpy as np

print(np.array_split(np.arange(1, 15), 4))
# [array([1, 2, 3, 4]), array([5, 6, 7, 8]), array([ 9, 10, 11]), array([12, 13, 14])]

It's easier to think of it in terms of the division that this ultimately represents.

14 = 4*3 + 2

14 = (3 + 3 + 3 + 3) + 2

   = (3 + 3 + 3 + 3) + (1 + 1)

And, critically, we can add those ones to the first two terms in the first bracket.

14 = 4 + 4 + 3 + 3
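The same redistribution can be sketched step by step in plain Python. The function name `distribute` is hypothetical and chosen only for this illustration: it starts from equal shares and then hands the leftover ones to the first terms.

```python
def distribute(n_total, n_sections):
    """Start from equal shares, then give one leftover element to each of the first sections."""
    quotient, remainder = divmod(n_total, n_sections)
    sizes = [quotient] * n_sections   # (3 + 3 + 3 + 3)
    for i in range(remainder):        # the leftover (1 + 1)
        sizes[i] += 1                 # added to the first `remainder` terms
    return sizes


sizes = distribute(14, 4)
print(sizes)       # [4, 4, 3, 3]
print(sum(sizes))  # 14
```

Because the remainder is always smaller than the number of sections, there are always enough terms to absorb every leftover one.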

In general, what we've done is add one to each of the first `remainder` terms of the output list, which gives us the snippet of code

...remainder * [quotient+1]...

Out of the Nsections terms in the output, we have now filled the first `remainder` terms, leaving the next (Nsections - remainder) terms to fill, each with the plain quotient:

...(Nsections- remainder) * [quotient])

which leaves us with the final code.
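Putting the pieces together, here is a minimal sketch of the `split_array` function from the question, written for plain Python lists. It follows the same idea as the abridged NumPy excerpt (sizes, then cumulative division points, then slices), but it is an illustration under that assumption, not NumPy's actual implementation; `itertools.accumulate` stands in for the `cumsum()` call.

```python
from itertools import accumulate


def split_array(array, n):
    """Split a list into n consecutive parts whose lengths differ by at most one."""
    if n <= 0:
        raise ValueError("number of sections must be larger than 0.")
    quotient, remainder = divmod(len(array), n)
    # Leading 0 so that the running sum yields start and end points.
    section_sizes = [0] + remainder * [quotient + 1] + (n - remainder) * [quotient]
    div_points = list(accumulate(section_sizes))
    return [array[div_points[i]:div_points[i + 1]] for i in range(n)]


print(split_array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 4))
# [[1, 2, 3], [4, 5, 6], [7, 8], [9, 10]]
```

For the example input the division points are [0, 3, 6, 8, 10], so each slice runs from one point to the next.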

"Could someone even just tell me what this kind of operation is called or what branch of mathematics this deals with as it's not something I've come across before."

I believe this is loosely related to number theory; the quotient-remainder theorem is probably one of the first things you learn in that subject.
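Written out, the identity behind the snippet looks like this. The symbols are chosen here for illustration: N is the total length (Ntotal), n the number of sections (Nsections), q the quotient, and r the remainder.

```latex
% Quotient-remainder theorem: for integers N >= 0 and n > 0 there are unique q, r with
N = nq + r, \qquad 0 \le r < n

% Sum of the section sizes built by the code:
\underbrace{r\,(q+1)}_{\text{first } r \text{ sections}}
+ \underbrace{(n-r)\,q}_{\text{remaining } n-r \text{ sections}}
= rq + r + nq - rq
= nq + r
= N
```

Since 0 <= r < n, both counts r and n - r are non-negative and together make exactly n sections, which is why the list always has the right length and the right total.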

Anyway, I hope that helped :)
