Creating numpy array from list gives wrong shape
Problem description
I'm creating several numpy arrays from a list of numpy arrays, like so:
import numpy as np

seq_length = 1500
seq_diff = 200 # difference between start of two sequences
# x and y are 2D numpy arrays
x_seqs = [x[i:i+seq_length,:] for i in range(0, seq_diff*(len(x) // seq_diff), seq_diff)]
y_seqs = [y[i:i+seq_length,:] for i in range(0, seq_diff*(len(y) // seq_diff), seq_diff)]
boundary1 = int(0.7 * len(x_seqs)) # 70% is training set
boundary2 = int(0.85 * len(x_seqs)) # 15% validation, 15% test
x_train = np.array(x_seqs[:boundary1])
y_train = np.array(y_seqs[:boundary1])
x_valid = np.array(x_seqs[boundary1:boundary2])
y_valid = np.array(y_seqs[boundary1:boundary2])
x_test = np.array(x_seqs[boundary2:])
y_test = np.array(y_seqs[boundary2:])
I'd like to end up with 6 arrays of shape (n, 1500, 300) where n is either 70%, 15% or 15% of my data for the training, validation and test arrays, respectively.
This is where it goes wrong: the _train and _valid arrays turn out fine, but the _test arrays are one-dimensional arrays of arrays. That is:
- x_train.shape is (459, 1500, 300)
- x_valid.shape is (99, 1500, 300)
- x_test.shape is (99,)
But printing x_test verifies that it contains the correct elements, i.e. it's a 99-element array of (1500, 300) arrays.
Why do the _test matrices get the wrong shape, while the _train and _valid matrices don't?
Recommended answer
The items in x_seqs vary in length. When they are all the same length, np.array can make a 3d array from them; when they differ, it makes a one-dimensional object array whose elements are the individual arrays. Look at the dtype of x_test. Look at [len(i) for i in x_test].
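The two checks suggested above can be sketched as follows. This is a minimal illustration using stand-in zero-filled data rather than the asker's real arrays. The object array is built explicitly with dtype=object, because newer NumPy releases raise an error for ragged input instead of creating the object array implicitly.

```python
import numpy as np

x = np.zeros((2000, 10))  # stand-in data, not the asker's real array
seq_length, seq_diff = 1500, 200

# Same windowing as in the question.
x_seqs = [x[i:i + seq_length, :]
          for i in range(0, seq_diff * (len(x) // seq_diff), seq_diff)]

# Check 1: the length of every window.
lengths = [len(s) for s in x_seqs]
print(lengths)

# Equal-length windows stack into a regular 3D array.
full = np.array(x_seqs[:3])
print(full.shape, full.dtype)

# Windows of different lengths can only be held in a 1D object array.
# Filling it element by element works the same way on old and new NumPy.
tail = x_seqs[3:]
ragged = np.empty(len(tail), dtype=object)
for k, s in enumerate(tail):
    ragged[k] = s

# Check 2: the dtype reveals the problem.
print(ragged.shape, ragged.dtype)
```

A dtype of object together with a one-dimensional shape is the sign that the windows did not all have the same length.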
I took your code, added:
x=np.zeros((2000,10))
y=x.copy()
...
print([len(i) for i in x_seqs])
print(x_train.shape)
print(x_valid.shape)
print(x_test.shape)
and got:
1520:~/mypy$ python3 stack40643639.py
[1500, 1500, 1500, 1400, 1200, 1000, 800, 600, 400, 200]
(7,)
(1, 600, 10)
(2,)
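The shorter windows appear because a slice that starts within the last seq_length rows runs past the end of the array and comes back truncated. One possible fix, which is not part of the original answer, is to generate only the start indices that leave room for a full window. The helper name full_windows and the (5000, 10) stand-in data below are assumptions made for illustration.

```python
import numpy as np

def full_windows(a, seq_length, seq_diff):
    """Hypothetical helper: stack only windows that contain seq_length rows."""
    # The last valid start is len(a) - seq_length, so no slice runs past the end.
    starts = range(0, len(a) - seq_length + 1, seq_diff)
    # np.stack raises ValueError if shapes differ (or if there are no windows),
    # so a ragged result cannot slip through silently.
    return np.stack([a[i:i + seq_length, :] for i in starts])

x = np.zeros((5000, 10))  # stand-in data, not the asker's real array
x_seqs = full_windows(x, seq_length=1500, seq_diff=200)

# Same 70% / 15% / 15% split as in the question, applied to the stacked array.
boundary1 = int(0.7 * len(x_seqs))
boundary2 = int(0.85 * len(x_seqs))
x_train = x_seqs[:boundary1]
x_valid = x_seqs[boundary1:boundary2]
x_test = x_seqs[boundary2:]

print(x_train.shape, x_valid.shape, x_test.shape)
```

Dropping the truncated windows discards a little data at the end of the array; padding them to seq_length would be an alternative if those rows matter.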