tensorflow/tflearn input shape

Problem description

I'm trying to create an LSTM RNN to generate sequences of music. The training data is a sequence of size-4 vectors representing various features (including the MIDI note) of each note in some songs to train on.

From my reading, it looks like what I want is, for each input sample, for the output sample to be the next size-4 vector (i.e. the network should try to predict the next note given the current one, with the LSTMs incorporating knowledge of the samples that came before).

I'm using tflearn as I'm still very new to RNNs. I have the following code:

net = tflearn.input_data(shape=[None, seqLength, 4])
net = tflearn.lstm(net, 128, return_seq=True)
net = tflearn.dropout(net, 0.5)
net = tflearn.lstm(net, 128)
net = tflearn.dropout(net, 0.5)
net = tflearn.fully_connected(net, 4, activation='softmax')
net = tflearn.regression(net, optimizer='adam', loss='mean_square')

# Training
model = tflearn.DNN(net, tensorboard_verbose=3)
model.fit(trainX, trainY, show_metric=True, batch_size=128)

Before this code, I split trainX and trainY into sequences of length 20 (chosen arbitrarily, but I read somewhere that training on sequences like this is a good approach).
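Roughly, the split looked something like this (the random array here is just a stand-in for my real note data):

import numpy as np

seqLength = 20
notes = np.random.rand(1000, 4)  # stand-in for the real per-note feature vectors

# Window both inputs and targets, shifted by one note.
trainX = np.array([notes[i:i + seqLength] for i in range(len(notes) - seqLength)])
trainY = np.array([notes[i + 1:i + seqLength + 1] for i in range(len(notes) - seqLength)])

print(trainX.shape)  # (980, 20, 4)
print(trainY.shape)  # (980, 20, 4)  <- 3D targets, but the net expects (?, 4)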

This seems to be fine, but I get the error: ValueError: Cannot feed value of shape (128, 16, 4) for Tensor u'TargetsData/Y:0', which has shape '(?, 4)'

So: my assumption so far is that the input shape [None, seqLength, 4] is telling TF [batch length (which tflearn feeds in batches), sequence length, feature length of each sample]. What I don't understand is why it says the output is the wrong shape. Am I making the wrong assumption with the data sequence split? When I instead feed in all my data without splitting it into sequences, so the input shape is [None, 4], TF tells me the LSTM layer expects an input shape with at least 3 dimensions.

I can't get my head around what the shapes of the inputs and outputs should be. It feels like this should be a simple thing: I have a set of input sequences of vectors, and I want the network to try to predict the next vector in each sequence. There's very little online that doesn't assume a fairly advanced level of knowledge, so I've hit a brick wall. I'd really appreciate any insight anyone can give!

Solution

I solved this, so I'm writing the answer here for anyone who has the same problem. It came down to a misunderstanding of how these networks work, but this is assumed knowledge in most tutorials I've read, so it may not be clear to other beginners.

LSTM networks are useful in these situations because they can take input history into account. The history is given to the LSTM through the sequencing, but each sequence still leads to a single output data point. So the input must be 3D, while the output is only 2D.

Given an entire sequence and a desired historyLength, I split the input into windows of historyLength vectors, each paired with a single output vector: the next note following the window. This solved my shape problem.
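As a minimal sketch of that split (make_training_data is just an illustrative name and the note data is a random placeholder, not my actual pipeline):

import numpy as np

def make_training_data(sequence, historyLength):
    """Slice one long run of note vectors into (history window, next note) pairs."""
    X, Y = [], []
    for i in range(len(sequence) - historyLength):
        X.append(sequence[i:i + historyLength])  # historyLength notes of context
        Y.append(sequence[i + historyLength])    # the single next note to predict
    return np.array(X), np.array(Y)

notes = np.random.rand(1000, 4)  # stand-in for the real song data
trainX, trainY = make_training_data(notes, historyLength=20)
print(trainX.shape)  # (980, 20, 4) -> 3D input, matches input_data([None, 20, 4])
print(trainY.shape)  # (980, 4)     -> 2D targets, matches the final layer

With the targets shaped (num_windows, 4), they match what TargetsData/Y:0 expects and the error goes away.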
