Averaging over the batch dimension in Keras


Question


I've got a problem where I want to predict one time series from many time series. My input is (batch_size, time_steps, features) and my output should be (1, time_steps, features).

I can't figure out how to average over N.

Here's a dummy example. First, dummy data where the output is a linear function of the 2000 time series:

import numpy as np

time = 100
N = 2000

# N time series, each a randomly scaled sine plus a random offset
dat = np.zeros((N, time))
for i in range(N):
    dat[i, :] = np.sin(np.arange(time)) * np.random.normal(size=1) + np.random.normal(size=1)

# target: a linear combination of the N series at each time step
y = dat.T @ np.random.normal(size=N)

Now I'll define a time series model (using 1-D conv nets):

from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Lambda
from keras.optimizers import Adam
from keras import backend as K

n_filters = 2
filter_width = 3
dilation_rates = [2**i for i in range(5)]  # 1, 2, 4, 8, 16

inp = Input(shape=(None, 1))
x = inp
for dilation_rate in dilation_rates:
    x = Conv1D(filters=n_filters,
               kernel_size=filter_width,
               padding='causal',
               activation='relu',
               dilation_rate=dilation_rate)(x)
x = Dense(1)(x)

model = Model(inputs=inp, outputs=x)
model.compile(optimizer=Adam(), loss='mean_squared_error')
model.predict(dat.reshape(N, time, 1)).shape

Out[43]: (2000, 100, 1)

The output is the wrong shape! Next, I tried using an averaging layer, but I get this weird error:

def av_over_batches(x):
    return K.mean(x, axis=0)

x = Lambda(av_over_batches)(x)

model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.predict(dat.reshape(N, time, 1)).shape

Traceback (most recent call last):

  File "<ipython-input-3-d43ccd8afa69>", line 4, in <module>
    model.predict(dat.reshape(N, time, 1)).shape

  File "/home/me/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)

  File "/home/me/.local/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 302, in predict_loop
    outs[i][batch_start:batch_end] = batch_out

ValueError: could not broadcast input array from shape (100,1) into shape (32,1)

Where does 32 come from? (Incidentally, I got the same number in my real data, not just in the MWE).

But the main question is: how can I build a network that averages over the input batch dimension?

Solution

I would approach the problem in a different way.

(As an aside, the 32 in your traceback is simply the default batch_size of model.predict. K.mean(x, axis=0) collapses the batch dimension, so each batch produces an output of shape (100, 1), and Keras then tries to write it back into the (32, 1) slot it reserved for that batch's samples. The sketch just below shows a direct in-graph fix, but I would avoid it: a model whose output averages over the batch depends on how the data happens to be batched.)
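A minimal sketch of that direct fix, assuming Keras 2.x (whose predict loop, as in your traceback, writes each batch output into a slice of the result array and will broadcast a size-1 batch axis) and reusing inp and x from your conv net above:

def av_over_batches(t):
    # keepdims=True keeps a batch axis of size 1: (batch, time, 1) -> (1, time, 1)
    return K.mean(t, axis=0, keepdims=True)

avg = Lambda(av_over_batches)(x)
avg_model = Model(inputs=inp, outputs=avg)
avg_model.compile(optimizer=Adam(), loss='mean_squared_error')

# Predict with all N series in one batch: every row of the result is then the
# same global average, so keep the first one.
pred = avg_model.predict(dat.reshape(N, time, 1), batch_size=N)
averaged = pred[0]  # shape (time, 1)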

Problem: you want to predict one time series from a set of time series. So let's say you have 3 time series TS1, TS2, TS3, each of 100 time steps, and you want to predict the 100 values y1, y2, ..., y100 of a single output series.

My approach to this problem is as follows: group the values of all the time series at each time step together and feed them to an LSTM. If some time series are shorter than others, pad them; similarly, if some sets contain fewer time series, pad those too (see the padding sketch just below).
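A small padding sketch, assuming unequal-length series, using keras.preprocessing.sequence.pad_sequences (the values are illustrative):

from keras.preprocessing.sequence import pad_sequences

# Three series of unequal length, zero-padded at the end to a common length.
series = [[1, 2, 3], [4, 5], [6]]
padded = pad_sequences(series, maxlen=3, padding='post', dtype='float32')
# padded == [[1., 2., 3.],
#            [4., 5., 0.],
#            [6., 0., 0.]]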

Example:

import numpy as np
np.random.seed(33)

time = 100
N = 5000
k = 5

# the fixed linear combination the model should learn
magic = np.random.normal(size=k)

x = list()
y = list()
for i in range(N):
    dat = np.zeros((k, time))
    for j in range(k):  # avoid shadowing the outer loop variable
        dat[j, :] = np.sin(np.arange(time)) * np.random.normal(size=1) + np.random.normal(size=1)
    x.append(dat)
    y.append(dat.T @ magic)

So I want to predict a 100-step time series from a set of k = 5 time series. We want the model to learn magic.

from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Lambda, LSTM
from keras.optimizers import Adam
from keras import backend as K
import matplotlib.pyplot as plt

inp = Input(shape=(time, k))  # avoid shadowing the built-in input
lstm = LSTM(32, return_sequences=True)(inp)
output = Dense(1, activation='sigmoid')(lstm)

model = Model(inputs=inp, outputs=output)
model.compile(optimizer=Adam(), loss='mean_squared_error')

data_x = np.zeros((N,100,5))
data_y = np.zeros((N,100,1))

for i in range(N):
    data_x[i] = x[i].T.reshape(100,5)
    data_y[i] = y[i].reshape(100,1)

from sklearn.preprocessing import StandardScaler

ss_x = StandardScaler()
ss_y = StandardScaler()

data_x = ss_x.fit_transform(data_x.reshape(N,-1)).reshape(N,100,5)
data_y = ss_y.fit_transform(data_y.reshape(N,-1)).reshape(N,100,1)

# Leave the last sample for testing; split the rest into train and validation
model.fit(data_x[:-1], data_y[:-1], batch_size=64, epochs=100, validation_split=.25)

The validation loss was still going down, but I stopped training there. Let's see how good our prediction is:

y_hat = model.predict(data_x[-1].reshape(-1, 100, 5))
plt.plot(data_y[-1], label='y')
plt.plot(y_hat.reshape(100), label='y_hat')
plt.legend(loc='upper left')
plt.show()
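Both curves above are in standardized units. A small sketch, assuming you want them back on the original scale, inverts the target scaler (ss_y was fit on rows of 100 values, so reshape accordingly):

# Map the standardized prediction and target back to the original units.
y_hat_orig = ss_y.inverse_transform(y_hat.reshape(1, -1)).reshape(100)
y_orig = ss_y.inverse_transform(data_y[-1].reshape(1, -1)).reshape(100)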

The results are promising. Running it for more epochs, plus hyperparameter tuning, should bring us even closer to magic. One can also try stacked LSTMs and bidirectional LSTMs; a sketch of both follows.
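A hedged sketch of those two variants together, reusing time and k from above (layer sizes are illustrative, not tuned):

from keras.layers import Bidirectional

inp2 = Input(shape=(time, k))
h = Bidirectional(LSTM(32, return_sequences=True))(inp2)  # bidirectional first layer
h = LSTM(32, return_sequences=True)(h)                    # stacked second LSTM
out2 = Dense(1)(h)

model2 = Model(inputs=inp2, outputs=out2)
model2.compile(optimizer=Adam(), loss='mean_squared_error')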

I feel RNNs are better suited to time series data than CNNs.

Data format: let's say time steps = 3 and there are 4 time series:

Time series 1 = [1,2,3]

Time series 2 = [4,5,6]

Time series 3 = [7,8,9]

Time series 4 = [10,11,12]

Y = [100,200,300]

For a batch size of 1

[[1,4,7,10],[2,5,8,11],[3,6,9,12]] -> LSTM -> [100,200,300]
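A tiny sketch of that transposition in numpy (the array names are illustrative):

import numpy as np

ts = np.array([[1, 2, 3],       # time series 1
               [4, 5, 6],       # time series 2
               [7, 8, 9],       # time series 3
               [10, 11, 12]])   # time series 4; shape (k=4, time=3)

batch = ts.T.reshape(1, 3, 4)   # -> (batch=1, time_steps=3, features=4)
# batch[0] == [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]]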
