Keras LSTM training data format


Question

I am trying to use an LSTM neural network (in Keras) to predict my opponent's next move in the game Rock-Paper-Scissors.

I have encoded the inputs as Rock: [1 0 0], Paper: [0 1 0], Scissors: [0 0 1]. Now I want to train the neural network, but I am a bit confused about the data structure of my training data.

I have stored an opponent's game history in a .csv file with the following structure:

1,0,0
0,1,0
0,1,0
0,0,1
1,0,0
0,1,0
0,1,0
0,0,1
1,0,0
0,0,1

I am trying to use every 5th row as a training label and the previous 4 rows as the training input. In other words, at each time step a vector of dimension 3 is fed to the network, and there are 4 time steps.

For example, the following 4 rows would be one training input:

1,0,0
0,1,0
0,1,0
0,0,1

and the 5th row would be the training label:

1,0,0

My question is: what data format does Keras' LSTM network accept, and what would be the best way to rearrange my data for this purpose? My incomplete code is attached below in case it helps:

#!/usr/bin/python
from __future__ import print_function

import numpy as np

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout
from keras.layers.recurrent import LSTM
from keras.optimizers import Adam

output_dim = 3
input_dim = 3
input_length = 4
batch_size = 20   #use all the data to train in one iteration


#each input has the following structure
#Rock: [1 0 0], Paper: [0 1 0], Scissor: [0 0 1]
#4 inputs (vectors) are sent to the LSTM net and output 1 vector as the prediction

#incomplete function
def read_data():
    raw_training = np.genfromtxt('training_data.csv',delimiter=',')
    # TODO: reshape raw_training into windows of 4 time steps plus a label row
    print(raw_training)

def createNet(summary=False):
    print("Start Initialzing Neural Network!")
    model = Sequential()
    model.add(LSTM(4,input_dim=input_dim,input_length=input_length,
            return_sequences=True,activation='softmax'))
    model.add(Dropout(0.1))
    model.add(LSTM(4,
            return_sequences=True,activation='softmax'))
    model.add(Dropout(0.1))
    model.add(Dense(3,activation='softmax'))
    model.add(Dropout(0.1))
    model.add(Dense(3,activation='softmax'))
    model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
    if summary:
        print(model.summary())
    return model

if __name__=='__main__':
    createNet(True)

Answer

The input to the LSTM should have the shape (sequence_length, input_dim), so in your case numpy arrays of shape (4, 3) should do it.

What you feed to the model will then be a numpy array of shape (number_of_train_examples, sequence_length, input_dim). In other words, you feed number_of_train_examples tables of shape (4, 3). Build a list of:

1,0,0
0,1,0
0,1,0
0,0,1

and then call np.array(list_of_train_example).
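As a concrete illustration, here is a minimal sketch of how the asker's read_data function could be completed to build such an array, assuming the .csv rows come in non-overlapping groups of five (four input moves followed by one label row, as in the example above); the names X, y, path and seq_len are my own:

import numpy as np

def read_data(path='training_data.csv', seq_len=4):
    """Read one-hot encoded moves and group them into (input window, label) pairs."""
    raw = np.genfromtxt(path, delimiter=',')          # shape: (n_rows, 3)
    X, y = [], []
    # non-overlapping blocks: rows 0-3 -> label row 4, rows 5-8 -> label row 9, ...
    for start in range(0, len(raw) - seq_len, seq_len + 1):
        X.append(raw[start:start + seq_len])          # one table of shape (4, 3)
        y.append(raw[start + seq_len])                # the 5th row, shape (3,)
    return np.array(X), np.array(y)                   # shapes (N, 4, 3) and (N, 3)

X can then be passed directly to model.fit as the training input, with y as the one-hot labels.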

However, I don't understand why you return the whole sequence for the second LSTM. It will output something of shape (4, 4), and the Dense layer will probably fail on that. return_sequences means that you return the whole sequence, i.e. the hidden output at every step of the LSTM. I would set it to False for the second LSTM so you only get a "summary" vector of shape (4,) that your Dense layer can read. In any case, even for the first LSTM it means that from an input of shape (4, 3) you output something of shape (4, 4), so you will have more parameters than input data for this layer, which can't really be good.

Regarding the activations, I would also use softmax, but only on the last layer; softmax is used to get probabilities as the output of a layer. It doesn't really make sense to use softmax on the LSTMs and on the Dense layer before the last one. Go for some other non-linearity such as "sigmoid" or "tanh".

This is what I would do model-wise:

def createNet(summary=False):
    print("Start Initialzing Neural Network!")
    model = Sequential()
    model.add(LSTM(4,input_dim=input_dim,input_length=input_length,
            return_sequences=True,activation='tanh'))
    model.add(Dropout(0.1))
    # output shape : (4,4)
    model.add(LSTM(4,
            return_sequences=False,activation='tanh'))
    model.add(Dropout(0.1))
    # output shape : (4,)
    model.add(Dense(3,activation='tanh'))
    model.add(Dropout(0.1))
    # output shape : (3,)
    model.add(Dense(3,activation='softmax'))
    # output shape : (3,)
    model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
    if summary:
        print(model.summary())
    return model
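
Putting the two pieces together, a minimal usage sketch (the batch size and epoch count here are my own arbitrary choices, not part of the original answer):

X, y = read_data('training_data.csv')          # X: (N, 4, 3), y: (N, 3)
model = createNet(summary=True)
model.fit(X, y, batch_size=20, nb_epoch=100)   # nb_epoch in Keras 1.x; use epochs=100 in Keras 2
predictions = model.predict(X[:1])             # probability distribution over Rock/Paper/Scissors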

