Keras misinterprets training data shape
Question
My training data has the form (?,15), where ? is a variable length.
When creating my model I specify this:
inp = Input(shape=(None,15))
conv = Conv1D(32,3,padding='same',activation='relu')(inp)
...
My training data has the shape (35730,?,15).
Checking this in Python I get:
X.shape
Outputs: (35730,)
X[0].shape
Outputs: (513, 15)
When I try to fit my model on my training data I get this ValueError:
Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (35730, 1)
I can only train my model by calling model.train_on_batch() on a single sample.
How can I solve this? It seems like Keras thinks the shape of my input data is (35730, 1) when it is actually (35730, ?, 15).
Is this a bug in Keras or did I do something wrong?
I am using the TensorFlow backend, if that matters. This is Keras 2.
Recommended answer
(Edited, according to OP's comment on this question, where they posted this link: https://github.com/fchollet/keras/issues/1920)
Your X is not a single numpy array; it's an array of arrays. (Otherwise its shape would be X.shape=(35730,513,15).)
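The "array of arrays" situation can be reproduced with a small NumPy sketch. The sample lengths and zero-filled data below are hypothetical stand-ins, not the OP's dataset; the point is only that a container of differently sized samples ends up as a 1-D object array, so its shape carries no information about the inner dimensions.

```python
import numpy as np

# Hypothetical stand-in data: three samples of different lengths, 15 features each.
lengths = [513, 200, 87]
samples = [np.zeros((n, 15), dtype=np.float32) for n in lengths]

# Build a 1-D object array explicitly, one slot per sample.
X = np.empty(len(samples), dtype=object)
for i, s in enumerate(samples):
    X[i] = s

print(X.shape)     # only the outer dimension is visible
print(X.dtype)     # object: each element is itself an array
print(X[0].shape)  # the inner shape is only visible per sample
```

An array like this has one dimension as far as NumPy is concerned, which is consistent with Keras complaining that it did not receive a 3-dimensional input.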
It must be a single numpy array for the fit method. Since the length varies, you cannot have a single numpy array containing all your data; you will have to divide it into smaller arrays, each containing data of the same length.
For that, you could create a dictionary keyed by shape and loop over the dictionary manually (there may be better ways to do this):
#code in python 3.5
xByShapes = {}
yByShapes = {}
for itemX,itemY in zip(X,Y):
    if itemX.shape in xByShapes:
        xByShapes[itemX.shape].append(itemX)
        yByShapes[itemX.shape].append(itemY)
    else:
        xByShapes[itemX.shape] = [itemX] #initially a list, because we're going to append items
        yByShapes[itemX.shape] = [itemY]
At the end, you loop over this dictionary for training:
for shape in xByShapes:
    model.fit(
        np.asarray(xByShapes[shape]),
        np.asarray(yByShapes[shape]),...
    )
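The grouping and the training loop can also be combined into one helper. The following is a minimal sketch, not the answer's exact code: group_by_shape is a hypothetical name, the toy data is invented for illustration, and the model.fit call is left as a comment because the remaining fit arguments are not given above.

```python
from collections import defaultdict

import numpy as np


def group_by_shape(X, Y):
    """Return {shape: (x_batch, y_batch)}, stacking samples that share a shape."""
    x_groups = defaultdict(list)
    y_groups = defaultdict(list)
    for item_x, item_y in zip(X, Y):
        x_groups[item_x.shape].append(item_x)
        y_groups[item_x.shape].append(item_y)
    return {
        shape: (np.stack(x_groups[shape]), np.asarray(y_groups[shape]))
        for shape in x_groups
    }


# Hypothetical toy data: two samples of length 4 and one of length 7.
X = [np.ones((4, 15)), np.ones((7, 15)), np.ones((4, 15))]
Y = [0, 1, 1]

batches = group_by_shape(X, Y)
for shape, (x_batch, y_batch) in batches.items():
    print(shape, x_batch.shape, y_batch.shape)
    # model.fit(x_batch, y_batch, ...) would be called here, once per shape.
```

Each x_batch is a regular 3-D array of shape (samples, length, 15), which is the form the fit method expects. One trade-off to keep in mind: with many distinct lengths, some groups may contain only a handful of samples.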
Masking
Alternatively, you can pad your data so that all samples have the same length, using zeros or some other dummy value.
Then, before anything else in your model, you can add a Masking layer that will ignore these padded segments. (Warning: some types of layers don't support masking.)
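The padding step can be sketched in plain NumPy. This is a minimal illustration under assumptions: pad_to_longest is a hypothetical helper, the toy samples are invented, and 0.0 is chosen as the pad value, which is only safe if real timesteps are never all zeros.

```python
import numpy as np


def pad_to_longest(samples, pad_value=0.0):
    """Pad each (length, features) sample at the end so all share the longest length."""
    max_len = max(s.shape[0] for s in samples)
    n_features = samples[0].shape[1]
    padded = np.full((len(samples), max_len, n_features), pad_value, dtype=np.float32)
    for i, s in enumerate(samples):
        padded[i, : s.shape[0], :] = s  # copy the real timesteps, leave the rest padded
    return padded


# Hypothetical toy data: three samples of lengths 4, 7 and 2, 15 features each.
samples = [np.ones((4, 15)), np.ones((7, 15)), np.ones((2, 15))]
X_padded = pad_to_longest(samples)
print(X_padded.shape)
```

The result is a single 3-D array that fit can accept. In the model, a Masking layer constructed with mask_value equal to the pad value would then mark the padded timesteps to be skipped. Whether the layers that follow (such as the Conv1D in the question) honor that mask should be checked against the Keras version in use.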