How to convert a 2D array into the format that Keras + LSTM needs
Question
I have a 5000 by 9 2D numpy array of features, trainX, which are the features of a time sequence. I also have a 1D numpy array of floating-point labels, trainY. This is exactly the format you would need for scikit-learn, for example.
I would like to use these with Keras + LSTM. This is my code at present:
from keras.models import Sequential
from keras.layers import LSTM, Dense

NUM_EPOCHS = 20
model = Sequential()
model.add(LSTM(8, input_shape=(1, window_size)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=NUM_EPOCHS, batch_size=1, verbose=2)
However, this doesn't work, since Keras seems to need trainX in a different format. I have read the manual, but I can't understand exactly what that format is.
How can I convert my data into a format that Keras will accept?
Recommended answer
The format is (samples, timeSteps, features).
How many sequences do you have? It sounds like one sequence of 5000 steps, is that right?
Then the format is (1,5000,9).
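As a minimal sketch of that reshape, assuming the whole array is treated as one sequence of 5000 steps, the leading samples axis can be added with NumPy. The random array is placeholder data standing in for the asker's trainX, and the name trainX_3d is only illustrative.

```python
import numpy as np

# Placeholder data standing in for the asker's (5000, 9) feature array.
trainX = np.random.rand(5000, 9)

# Add a leading "samples" axis: one sequence, 5000 time steps, 9 features.
trainX_3d = trainX.reshape((1, trainX.shape[0], trainX.shape[1]))

print(trainX_3d.shape)  # (1, 5000, 9)
```

With this layout, the LSTM layer's input_shape would be (5000, 9), that is (timeSteps, features), since input_shape leaves out the samples axis.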
If you have one label per time step, the labels should also be 3D: (1,5000,1) (then use return_sequences=True in the LSTM layer). Otherwise, with a single label for the whole sequence, the labels are (1,1).
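The two label layouts can be sketched the same way with NumPy. The random array is placeholder data standing in for the asker's trainY, the variable names are illustrative, and picking the final value as the whole-sequence label is purely an assumption made for the example.

```python
import numpy as np

# Placeholder data standing in for the asker's 1D label array.
trainY = np.random.rand(5000)

# One label per time step: (samples, timeSteps, 1),
# to be paired with return_sequences=True on the LSTM layer.
trainY_per_step = trainY.reshape((1, -1, 1))

# One label for the whole sequence: (samples, 1).
# Assumption for illustration: the final value is taken as the sequence's label.
trainY_single = trainY[-1:].reshape((1, 1))

print(trainY_per_step.shape)  # (1, 5000, 1)
print(trainY_single.shape)    # (1, 1)
```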
Optionally, you may want to split your single sequence into many segments, as in the classic sliding-window case, where you'd have many samples with fewer time steps, such as (4998,3,1), supposing you want a 3-step window. The labels should then follow: (4998,1).
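A rough sketch of the sliding-window split, using placeholder data in place of the asker's arrays. With the 9 features from the question, the last axis is 9 rather than the 1 in the example shape above. Labelling each window with the label of its last time step is an assumption made here; the answer does not say how labels should be aligned.

```python
import numpy as np

# Placeholder data standing in for the asker's arrays.
trainX = np.random.rand(5000, 9)
trainY = np.random.rand(5000)

window = 3  # number of time steps per sample
n_windows = trainX.shape[0] - window + 1  # 5000 - 3 + 1 = 4998

# Stack overlapping windows along a new leading "samples" axis.
X_windows = np.stack([trainX[i:i + window] for i in range(n_windows)])

# Assumption: each window is labelled with the label of its last time step.
y_windows = trainY[window - 1:].reshape((-1, 1))

print(X_windows.shape)  # (4998, 3, 9)
print(y_windows.shape)  # (4998, 1)
```

Here the LSTM layer's input_shape would be (3, 9), and return_sequences can stay at its default of False because there is one label per window.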