Keras misinterprets training data shape

Question

Each of my training samples has the form (?, 15), where ? is a variable length.

When creating my model, I specify this:

inp = Input(shape=(None,15))
conv = Conv1D(32,3,padding='same',activation='relu')(inp)
...

My training data has the shape (35730, ?, 15).

Checking this in Python, I get:

X.shape

Outputs: (35730,)

X[0].shape

Outputs: (513, 15)

When I try to fit my model on my training data, I get this ValueError:

Error when checking input: expected input_1 to have 3 dimensions, but got array with shape (35730, 1)

I can only train my model by calling model.train_on_batch() on a single sample.

How can I solve this? It seems that Keras thinks the shape of my input data is (35730, 1) when it actually is (35730, ?, 15).

Is this a bug in Keras, or did I do something wrong?

I am using the TensorFlow backend, if that matters. This is Keras 2.

Recommended answer

(Edited according to the OP's comment on this question, where they posted this link: https://github.com/fchollet/keras/issues/1920)

Your X is not a single numpy array; it is an array of arrays. (Otherwise its shape would be X.shape=(35730,513,15).)
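To see why X.shape reports only one dimension, here is a minimal sketch using small made-up samples (the lengths 513, 200 and 87 are illustrative stand-ins for the real data, not taken from it). Samples of different lengths can only be held in a 1-D object array, whereas equal-length samples stack into a regular 3-D array:

```python
import numpy as np

# Hypothetical stand-in for the question's data: three samples with
# different lengths but the same 15 features per time step.
samples = [np.zeros((513, 15)), np.zeros((200, 15)), np.zeros((87, 15))]

# Ragged samples cannot form a regular 3-D array, so they end up in a
# 1-D array whose elements are themselves arrays (dtype=object).
X = np.empty(len(samples), dtype=object)
for i, sample in enumerate(samples):
    X[i] = sample

print(X.shape)     # (3,)
print(X[0].shape)  # (513, 15)
print(X.dtype)     # object

# With equal lengths, NumPy stacks the samples into one 3-D array.
regular = np.asarray([np.zeros((513, 15)) for _ in range(3)])
print(regular.shape)  # (3, 513, 15)
```

The object array is what the question's X appears to be: Keras sees 35730 opaque elements rather than a 3-D tensor.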

The fit method needs a single numpy array. Because your samples vary in length, one regular numpy array cannot hold all of your data; you will have to split it into smaller arrays, each containing samples of the same length.

For that, you could build a dictionary keyed by shape and loop over it manually (there may be better ways to do this...):

#code in python 3.5
xByShapes = {}
yByShapes = {}
for itemX,itemY in zip(X,Y):
    if itemX.shape in xByShapes:
        xByShapes[itemX.shape].append(itemX)
        yByShapes[itemX.shape].append(itemY)
    else:
        xByShapes[itemX.shape] = [itemX] #initially a list, because we're going to append items
        yByShapes[itemX.shape] = [itemY]

At the end, you loop over this dictionary for training:

import numpy as np

for shape in xByShapes:
    model.fit(
              np.asarray(xByShapes[shape]),
              np.asarray(yByShapes[shape]),...  # ... stands for your other fit arguments
              )
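As a self-contained illustration of the grouping idea, the sketch below runs on toy data without Keras. The helper name group_by_shape and the toy arrays are hypothetical and not part of the original answer; the point is only that each group stacks into a regular 3-D array that a fit call could accept:

```python
from collections import defaultdict

import numpy as np


def group_by_shape(X, Y):
    """Group samples that share a shape and stack each group into one array."""
    x_groups = defaultdict(list)
    y_groups = defaultdict(list)
    for item_x, item_y in zip(X, Y):
        x_groups[item_x.shape].append(item_x)
        y_groups[item_x.shape].append(item_y)
    # np.stack succeeds here because every sample in a group has the same shape.
    return {
        shape: (np.stack(x_groups[shape]), np.asarray(y_groups[shape]))
        for shape in x_groups
    }


# Toy data (assumed for illustration): two samples of length 5, one of length 3.
X = [np.ones((5, 15)), np.ones((3, 15)), np.ones((5, 15))]
Y = [0, 1, 1]

batches = group_by_shape(X, Y)
for shape, (x_batch, y_batch) in batches.items():
    print(shape, x_batch.shape, y_batch.shape)
```

Each x_batch has shape (n_samples, length, 15), which matches the Input(shape=(None,15)) declared in the question.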

Masking

Alternatively, you can pad your data with zeros or some dummy value so that all samples have the same length.

Then, as the first layer of your model, you can add a Masking layer that will ignore these padded segments. (Warning: some types of layer don't support masking.)
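The padding step can be sketched with plain numpy as follows. The helper pad_to_longest and the toy samples are hypothetical names introduced for illustration, and padding on the right with 0.0 is an assumption; any consistent dummy value would do:

```python
import numpy as np


def pad_to_longest(samples, pad_value=0.0):
    """Right-pad variable-length (length, features) arrays into one 3-D array."""
    max_len = max(sample.shape[0] for sample in samples)
    n_features = samples[0].shape[1]
    padded = np.full(
        (len(samples), max_len, n_features), pad_value, dtype=np.float32
    )
    for i, sample in enumerate(samples):
        # Copy the real time steps; the tail keeps the pad value.
        padded[i, : sample.shape[0], :] = sample
    return padded


# Toy data (assumed for illustration): lengths 4 and 2, 15 features each.
samples = [np.ones((4, 15)), np.ones((2, 15))]
X_padded = pad_to_longest(samples)
print(X_padded.shape)  # (2, 4, 15)
```

The resulting array is a single regular numpy array. A Masking layer whose mask_value equals the pad value would then skip time steps in which every feature equals that value, so a genuine all-zero time step would be skipped as well. Keras also ships a pad_sequences utility that may serve the same purpose.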
