How to iterate over this n-dimensional dataset?


Question

I have a dataset which has 4 dimensions (for now...) and I need to iterate over it.

To access a value in the dataset, I do this:

value = dataset[i,j,k,l]

I can also get the shape of the dataset:

shape = [4,5,2,6]

Each value in shape is the length of the corresponding dimension.

Given the number of dimensions, how can I iterate over all the elements in my dataset? Here is an example for the current 4 dimensions:

for i in range(shape[0]):
    for j in range(shape[1]):
        for k in range(shape[2]):
            for l in range(shape[3]):
                print('BOOM')
                value = dataset[i,j,k,l]

In the future, the shape may change. For example, shape may have 10 elements rather than the current 4.

Is there a nice and clean way to do this with Python 3?

Recommended answer

You could use itertools.product to iterate over the Cartesian product [1] of some values (in this case, the indices):

import itertools
shape = [4,5,2,6]
for idx in itertools.product(*[range(s) for s in shape]):
    value = dataset[idx]
    print(idx, value)
    # i would be "idx[0]", j "idx[1]" and so on...
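
As a self-contained sketch of the approach above: the question does not say what type dataset is, so a plain dict keyed by index tuples stands in for it here (an assumption made only so the example runs without extra libraries), and iter_indices is a hypothetical helper name, not part of itertools.

```python
import itertools


def iter_indices(shape):
    """Yield every index tuple for `shape`, with the last axis varying fastest."""
    return itertools.product(*(range(s) for s in shape))


# Stand-in for the real dataset (an assumption): a dict keyed by index tuples,
# so dataset[i, j, k, l] works just like in the question.
shape = [4, 5, 2, 6]
dataset = {idx: n for n, idx in enumerate(iter_indices(shape))}

visited = 0
for idx in iter_indices(shape):
    value = dataset[idx]  # same as dataset[i, j, k, l]
    visited += 1

print(visited)  # 240, i.e. 4 * 5 * 2 * 6

# The same helper works unchanged if the shape grows to 10 dimensions.
print(sum(1 for _ in iter_indices([2] * 10)))  # 1024, i.e. 2 ** 10
```

Nothing in the loop depends on the number of dimensions, which is the point: only shape changes when the dataset gains or loses an axis.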


However, if it's a NumPy array you want to iterate over, it could be easier to use np.ndenumerate:

import numpy as np

arr = np.random.random([4,5,2,6])
for idx, value in np.ndenumerate(arr):
    print(idx, value)
    # i would be "idx[0]", j "idx[1]" and so on...
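
If only the index tuples are needed rather than the values, NumPy also offers np.ndindex, which is not mentioned in the answer above but follows the same idea. The sketch below uses an arbitrary illustrative array and checks that np.ndenumerate and np.ndindex visit the same indices in the same order.

```python
import numpy as np

# Illustrative array; the values are arbitrary.
arr = np.arange(4 * 5 * 2 * 6).reshape(4, 5, 2, 6)

# np.ndenumerate yields (index tuple, value) pairs.
pairs = list(np.ndenumerate(arr))

# np.ndindex yields only the index tuples for a given shape.
indices = list(np.ndindex(*arr.shape))

print(len(pairs), len(indices))  # 240 240

# Both visit the same indices in the same order, and each value matches arr[idx].
same_order = [idx for idx, _ in pairs] == indices
values_match = all(arr[idx] == value for idx, value in pairs)
print(same_order, values_match)
```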


[1] You asked for clarification on what itertools.product(*[range(s) for s in shape]) actually does, so I'll explain it in more detail.

For example, suppose you have this loop:

for i in range(10):
    for j in range(8):
        pass  # do whatever

This can also be written with product:

for i, j in itertools.product(range(10), range(8)):
#                                        ^^^^^^^^---- the inner for loop
#                             ^^^^^^^^^-------------- the outer for loop
    pass  # do whatever

That means product is just a handy way of reducing the number of independent for-loops.
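
A small check of that claim, using shorter ranges than the ones above so the output stays readable (the variable names nested and flat are illustrative only):

```python
import itertools

# Nested loops: collect (i, j) in the order the loop bodies run.
nested = []
for i in range(2):
    for j in range(3):
        nested.append((i, j))

# Single loop over the Cartesian product of the same ranges.
flat = list(itertools.product(range(2), range(3)))

print(flat)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
print(nested == flat)  # True
```

The last argument to product varies fastest, just like the innermost loop.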

If you want to convert a variable number of for-loops to a product, you essentially need two steps:

# Create the "values" each for-loop iterates over
loopover = [range(s) for s in shape]

# Unpack the list using "*" operator because "product" needs them as 
# different positional arguments:
prod = itertools.product(*loopover)

for idx in prod:
    # idx is a tuple that can be unpacked if you know the number of values,
    # e.g. for a 4-dimensional shape: i_0, i_1, i_2, i_3 = idx
    pass  # do whatever

This is equivalent to:

for i_0 in range(shape[0]):
    for i_1 in range(shape[1]):
        ...  # more loops
            for i_n in range(shape[n]):  # n is the last index of "shape", i.e. len(shape) - 1
                # do whatever
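
The equivalence can be checked for the shape from the question. This is only a sketch: the nested version is hand-written for exactly four dimensions, while the product version makes no assumption about len(shape).

```python
import itertools

shape = [4, 5, 2, 6]

# Hand-written nested loops for exactly four dimensions.
nested = [
    (i, j, k, l)
    for i in range(shape[0])
    for j in range(shape[1])
    for k in range(shape[2])
    for l in range(shape[3])
]

# Generic version: works for any len(shape).
loopover = [range(s) for s in shape]
generic = list(itertools.product(*loopover))

print(nested == generic)  # True
print(generic[0], generic[-1])  # (0, 0, 0, 0) (3, 4, 1, 5)
```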
