How to input classification time series data into an LSTM

Question

I want to feed my data into an LSTM network, but I can't find any similar question or tutorial. My dataset looks something like this:

person 1:
    t1 f1 f2 f3
    t2 f1 f2 f3
     ...
    tn f1 f2 f3
.
.
.

person K:
    t1 f1 f2 f3
    t2 f1 f2 f3
     ...
    tn f1 f2 f3

So I have K persons, and for each person I have a matrix as input. The first column of each row is an increasing time stamp (like a timeline, so t1 < t2), and the other columns are the person's features at that time.

Mathematically, I have a (number of examples, number of time stamps, number of features) array such as (52, 20, 4), where 52 is the number of persons, 20 is the number of time stamps per person, and 4 is the number of columns (1 time stamp column and 3 feature columns).
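The layout described above can be sketched with NumPy as follows. The `build_dataset` helper and the random values are hypothetical and only illustrate how K per-person matrices of shape (20, 4) stack into a single (K, 20, 4) array.

```python
import numpy as np

def build_dataset(person_matrices):
    """Stack per-person (time, column) matrices into a (person, time, column) array."""
    return np.stack(person_matrices, axis=0)

# Hypothetical data: 52 persons, 20 time stamps, 4 columns (time stamp + 3 features).
rng = np.random.default_rng(0)
persons = []
for _ in range(52):
    t = np.arange(20, dtype=np.float32).reshape(-1, 1)  # increasing time stamps
    f = rng.normal(size=(20, 3)).astype(np.float32)     # three features per time stamp
    persons.append(np.hstack([t, f]))                   # one (20, 4) matrix per person

X = build_dataset(persons)
print(X.shape)  # (52, 20, 4)
```

The first axis indexes persons, so one row of labels (one class per person) lines up with it.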

Each person has a class label. I want to classify these persons into two classes using an LSTM neural network. My question is: how do I input this type of data into an LSTM in a high-level library such as Keras?

Edit: My first attempt was to use this as input_shape in Keras, but I get 50% accuracy in binary classification! Is the problem in my dataset, or is input_shape wrong?

LSTM(5,input_shape=(20,4))

Recommended answer

You need to represent each person's data with a feature vector and pass this vector into a classifier (e.g. an MLP classifier). I guess your question might be how to get the feature vector out of the raw data. There are many ways to extract features from time-series data; in your case, an LSTM is one option.

An LSTM needs a 3D tensor as input, with shape [batch_size x time x feature]. As you mentioned in the question, you can feed data into the model with:

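from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(5, input_shape=(20, 4)))    # 20 time steps, 4 columns per step
model.add(Dense(2, activation='sigmoid'))  # two output units, one per class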

1) I guess the t and f values vary widely and are not normalized, which would explain why the LSTM's predictions are no better than chance.

2) Your dataset is relatively small. To find out where the issue lies, overfit the model on a small subset of the training data. If you reach 100% accuracy on that subset, the LSTM is able to learn a useful feature representation of your data. Otherwise, either the model is not well designed or the data is not being fed in properly.
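Both checks can be sketched roughly as below. This is only an illustration under several assumptions: the data is random stand-in data, the `standardize` helper is hypothetical, Keras is imported through TensorFlow, and the output layer is a single sigmoid unit with a binary cross-entropy loss, which is a swapped-in alternative to the two-unit layer shown earlier.

```python
import numpy as np
from tensorflow import keras

def standardize(X, eps=1e-8):
    """Scale each column to zero mean and unit variance over all persons and time steps."""
    mean = X.mean(axis=(0, 1), keepdims=True)
    std = X.std(axis=(0, 1), keepdims=True)
    return (X - mean) / (std + eps)

# Hypothetical small subset: 8 persons, 20 time steps, 4 columns on a large scale.
rng = np.random.default_rng(0)
X_small = rng.normal(loc=100.0, scale=50.0, size=(8, 20, 4)).astype("float32")
y_small = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype="float32")

# Check 1: normalize the inputs before they reach the LSTM.
X_small = standardize(X_small)

# Check 2: try to overfit the small subset on purpose.
model = keras.Sequential([
    keras.Input(shape=(20, 4)),
    keras.layers.LSTM(5),
    keras.layers.Dense(1, activation="sigmoid"),  # assumption: one unit for two classes
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

history = model.fit(X_small, y_small, epochs=200, batch_size=8, verbose=0)
print("final training accuracy:", history.history["accuracy"][-1])
```

If the training accuracy on such a small subset stays near 50% after many epochs, the input pipeline or the model definition is the more likely culprit than the data itself. On real data, the mean and standard deviation would normally be computed on the training split only and reused for validation and test data.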
