Why do predictions differ for Autoencoder vs. Encoder + Decoder?

Question

I built a 1D CNN autoencoder in Keras, following the advice in this SO question, where the Encoder and Decoder are defined separately. My goal is to re-use the decoder once the autoencoder has been trained. The central layer of my autoencoder is a Dense layer, because I would like to learn it afterwards.

My problem is that if I compile and fit the whole autoencoder, written as Decoder()(Encoder()(x)) where x is the input, I get a different prediction when I do

autoencoder.predict(training_set)

than if I first encode the training set into a set of central features and then let the decoder decode them. These two approaches should give identical answers once the autoencoder has been trained.

from tensorflow.keras.layers import Input, Dense, BatchNormalization, Flatten, Lambda, Activation, Conv1D, MaxPooling1D, UpSampling1D, Reshape
from tensorflow.keras.models import Model
from tensorflow.keras import optimizers
from tensorflow.keras.layers import GaussianNoise
import keras.backend as K
from tensorflow.keras.layers import Add

import tensorflow as tf

import scipy.io
import sys
import matplotlib.pyplot as plt
import numpy as np
import copy


training = # some training set, 1500 samples of 501 point each
testing = # some testing set, 500 samples of 501 point each

# reshaping for CNN
training = np.reshape(training, [1500, 501, 1])
testing = np.reshape(testing, [500, 501, 1])


# normalize input
X_mean = training.mean()
training -= X_mean
X_std = training.std()
training /= X_std


copy_of_test = copy.copy(testing)
testing -= X_mean
testing /= X_std

### MODEL ###

def Encoder():
    encoder_input = Input(batch_shape=(None, 501, 1))  
    e1 = Conv1D(256,3, activation='tanh', padding='valid')(encoder_input)
    e2 = MaxPooling1D(2)(e1)
    e3 = Conv1D(32,3, activation='tanh', padding='valid')(e2)
    e4 = MaxPooling1D(2)(e3)
    e5 = Flatten()(e4)
    encoded = Dense(32,activation = 'tanh')(e5)
    return Model(encoder_input, encoded)


def Decoder():
    encoded_input = Input(shape=(32,))  
    encoded_reshaped = Reshape((32,1))(encoded_input)
    d1 = Conv1D(32, 3, activation='tanh', padding='valid', name='decod_conv1d_1')(encoded_reshaped)
    d2 = UpSampling1D(2, name='decod_upsampling1d_1')(d1)
    d3 = Conv1D(256, 3, activation='tanh', padding='valid', name='decod_conv1d_2')(d2)
    d4 = UpSampling1D(2, name='decod_upsampling1d_2')(d3)
    d5 = Flatten(name='decod_flatten')(d4)
    d6 = Dense(501, name='decod_dense1')(d5)
    decoded = Reshape((501,1), name='decod_reshape')(d6)
    return Model(encoded_input, decoded)


# define input to the model:
x = Input(batch_shape=(None, 501, 1))
y = Input(shape=(32,))

# make the model:
autoencoder = Model(x, Decoder()(Encoder()(x)))

# compile the model:
autoencoder.compile(optimizer='adam', loss='mse')
for layer in autoencoder.layers: print(K.int_shape(layer.output))


epochs = 100
batch_size = 100
validation_split = 0.2
# train the model
history = autoencoder.fit(x = training, y = training,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_split=validation_split)

# Encoder
encoder = Model(inputs=x, outputs=Encoder()(x), name='encoder')
print('enc:')
for layer in encoder.layers: print(K.int_shape(layer.output))
features = encoder.predict(training) # features

# Decoder
decoder = Model(inputs=y, outputs=Decoder()(y), name='decoder')
print('dec:')
for layer in decoder.layers: print(K.int_shape(layer.output))
score = decoder.predict(features) # 
score = np.squeeze(score)    

predictions = autoencoder.predict(training)
predictions = np.squeeze(predictions)

# plotting one random case
# score should be equal to predictions!
# because score is obtained from the trained decoder acting on the encoded features, while predictions are obtained from the Autoencoder acting on the training set
plt.plot(score[100], label='eD')
plt.plot(predictions[100], label='AE')
plt.legend()
plt.show()
plt.close()

EDIT following OverLordGoldDragon's answer:

I implemented the suggestion from the answer, writing the following in the same file:

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")


def Encoder():
    encoder_input = Input(batch_shape=(None, 501, 1))  
    e1 = Conv1D(256,3, activation='tanh', padding='valid')(encoder_input)
    e2 = MaxPooling1D(2)(e1)
    e3 = Conv1D(32,3, activation='tanh', padding='valid')(e2)
    e4 = MaxPooling1D(2)(e3)
    e5 = Flatten()(e4)
    encoded = Dense(32,activation = 'tanh')(e5)
    encoded = Reshape((32,1))(encoded)
    return Model(encoder_input, encoded)


def Decoder():
    encoded_input = Input(shape=(32,))  
    encoded_reshaped = Reshape((32,1))(encoded_input)
    d1 = Conv1D(32, 3, activation='tanh', padding='valid', name='decod_conv1d_1')(encoded_reshaped)
    d2 = UpSampling1D(2, name='decod_upsampling1d_1')(d1)
    d3 = Conv1D(256, 3, activation='tanh', padding='valid', name='decod_conv1d_2')(d2)
    d4 = UpSampling1D(2, name='decod_upsampling1d_2')(d3)
    d5 = Flatten(name='decod_flatten')(d4)
    d6 = Dense(501, name='decod_dense1')(d5)
    decoded = Reshape((501,1), name='decod_reshape')(d6)
    return Model(encoded_input, decoded)


def DecoderAE(encoder_input, encoded_input):
    encoded_reshaped = Reshape((32,1))(encoded_input)
    d1 = Conv1D(32, 3, activation='tanh', padding='valid',
                       name='decod_conv1d_1')(encoded_reshaped)
    d2 = UpSampling1D(2, name='decod_upsampling1d_1')(d1)
    d3 = Conv1D(256, 3, activation='tanh', padding='valid', name='decod_conv1d_2')(d2)
    d4 = UpSampling1D(2, name='decod_upsampling1d_2')(d3)
    d5 = Flatten(name='decod_flatten')(d4)
    d6 = Dense(501, name='decod_dense1')(d5)
    decoded = Reshape((501,1), name='decod_reshape')(d6)
    return Model(encoder_input, decoded)


def load_weights(model, filepath):
    with h5py.File(filepath, mode='r') as f:
        file_layer_names = [n.decode('utf8') for n in f.attrs['layer_names']]
        model_layer_names = [layer.name for layer in model.layers]

        weight_values_to_load = []
        for name in file_layer_names:
            if name not in model_layer_names:
                print(name, "is ignored; skipping")
                continue
            g = f[name]
            weight_names = [n.decode('utf8') for n in g.attrs['weight_names']]

            weight_values = []
            if len(weight_names) != 0:
                weight_values = [g[weight_name] for weight_name in weight_names]
            try:
                layer = model.get_layer(name=name)
            except:
                layer = None
            if layer is not None:
                symbolic_weights = (layer.trainable_weights + 
                                    layer.non_trainable_weights)
                if len(symbolic_weights) != len(weight_values):
                    print('Model & file weights shapes mismatch')
                else:
                    weight_values_to_load += zip(symbolic_weights, weight_values)

        K.batch_set_value(weight_values_to_load)


X = np.random.randn(10, 501, 1)
reset_seeds()
encoder = Encoder()
AE = DecoderAE(encoder.input, encoder.output)
AE.compile(optimizer='adam', loss='mse')


epochs = 10
batch_size = 100
validation_split = 0.2
# train the model
history = AE.fit(x = training, y = training,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_split=validation_split)


reset_seeds()
encoder = Encoder()
decoder = Decoder()

# Test equality
features = encoder.predict(X)
features = np.squeeze(features) # had to add this otherwise it would complain because of wrong shapes
score = decoder.predict(features) 
predictions = AE.predict(X)
print(np.sum(score - predictions))
# I am actually getting values >> 1


AE.save_weights('autoencoder_weights.h5')
AE_saved_weights = AE.get_weights()

decoder = Decoder()
load_weights(decoder, 'autoencoder_weights.h5')  # see "reference"
decoder_loaded_weights = decoder.get_weights()

AE_decoder_weights = AE_saved_weights[-len(decoder_loaded_weights):]
for w1, w2 in zip(AE_decoder_weights, decoder_loaded_weights):
    print(np.sum(w1 - w2))

The code runs and trains the AE; however:

1) score and predictions still differ (the printed sum is >> 1)

2) the code stops, producing:

(u'input_1', 'is ignored; skipping')
(u'conv1d', 'is ignored; skipping')
(u'max_pooling1d', 'is ignored; skipping')
(u'conv1d_1', 'is ignored; skipping')
(u'max_pooling1d_1', 'is ignored; skipping')
(u'flatten', 'is ignored; skipping')
(u'dense', 'is ignored; skipping')
(u'reshape', 'is ignored; skipping')
(u'reshape_1', 'is ignored; skipping')
Traceback (most recent call last):
  File "Autoenc.py", line 256, in <module>
    load_weights(decoder, 'autoencoder_weights.h5')  # see "reference"
  File "Autoenc.py", line 219, in load_weights
    K.batch_set_value(weight_values_to_load)
  File "/home/user/anaconda3/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2725, in batch_set_value
    assign_placeholder = tf.placeholder(tf_dtype,
AttributeError: 'module' object has no attribute 'placeholder'

EDIT 2

Here is my new file, following the last comment by @OverLordGoldDragon. I get the error posted below.

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)
    else:
        tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")


def Encoder():
    encoder_input = Input(batch_shape=(None, 501, 1))  
    e1 = Conv1D(256,3, activation='tanh', padding='valid')(encoder_input)
    e2 = MaxPooling1D(2)(e1)
    e3 = Conv1D(32,3, activation='tanh', padding='valid')(e2)
    e4 = MaxPooling1D(2)(e3)
    e5 = Flatten()(e4)
    encoded = Dense(32,activation = 'tanh')(e5)
    encoded = Reshape((32,1))(encoded)
    return Model(encoder_input, encoded)


def Decoder():
    encoded_input = Input(shape=(32,))  
    encoded_reshaped = Reshape((32,1))(encoded_input)
    d1 = Conv1D(32, 3, activation='tanh', padding='valid', name='decod_conv1d_1')(encoded_reshaped)
    d2 = UpSampling1D(2, name='decod_upsampling1d_1')(d1)
    d3 = Conv1D(256, 3, activation='tanh', padding='valid', name='decod_conv1d_2')(d2)
    d4 = UpSampling1D(2, name='decod_upsampling1d_2')(d3)
    d5 = Flatten(name='decod_flatten')(d4)
    d6 = Dense(501, name='decod_dense1')(d5)
    decoded = Reshape((501,1), name='decod_reshape')(d6)
    return Model(encoded_input, decoded)


def DecoderAE(encoder_input, encoded_input):
    encoded_reshaped = Reshape((32,1))(encoded_input)
    d1 = Conv1D(32, 3, activation='tanh', padding='valid',
                       name='decod_conv1d_1')(encoded_reshaped)
    d2 = UpSampling1D(2, name='decod_upsampling1d_1')(d1)
    d3 = Conv1D(256, 3, activation='tanh', padding='valid', name='decod_conv1d_2')(d2)
    d4 = UpSampling1D(2, name='decod_upsampling1d_2')(d3)
    d5 = Flatten(name='decod_flatten')(d4)
    d6 = Dense(501, name='decod_dense1')(d5)
    decoded = Reshape((501,1), name='decod_reshape')(d6)
    return Model(encoder_input, decoded)


def load_weights(model, filepath):
    with h5py.File(filepath, mode='r') as f:
        file_layer_names = [n.decode('utf8') for n in f.attrs['layer_names']]
        model_layer_names = [layer.name for layer in model.layers]

        weight_values_to_load = []
        for name in file_layer_names:
            if name not in model_layer_names:
                print(name, "is ignored; skipping")
                continue
            g = f[name]
            weight_names = [n.decode('utf8') for n in g.attrs['weight_names']]

            weight_values = []
            if len(weight_names) != 0:
                weight_values = [g[weight_name] for weight_name in weight_names]
            try:
                layer = model.get_layer(name=name)
            except:
                layer = None
            if layer is not None:
                symbolic_weights = (layer.trainable_weights + 
                                    layer.non_trainable_weights)
                if len(symbolic_weights) != len(weight_values):
                    print('Model & file weights shapes mismatch')
                else:
                    weight_values_to_load += zip(symbolic_weights, weight_values)

        K.batch_set_value(weight_values_to_load)


X = np.random.randn(10, 501, 1)
reset_seeds()
encoder = Encoder()
AE = DecoderAE(encoder.input, encoder.output)
AE.compile(optimizer='adam', loss='mse')


epochs = 2
batch_size = 100
validation_split = 0.2
# train the model
history = AE.fit(x = training, y = training,
                    epochs=epochs,
                    batch_size=batch_size,
                    validation_split=validation_split)


reset_seeds()
encoder = Encoder()
decoder = Decoder()
decoder.save_weights('decoder_weights.h5')



AE.save_weights('autoencoder_weights.h5')
AE_saved_weights = AE.get_weights()

decoder = Decoder()
load_weights(decoder, 'autoencoder_weights.h5')  # see "reference"
decoder_loaded_weights = decoder.get_weights()

# Test equality
features = encoder.predict(X)
features = np.squeeze(features) 
score = decoder.predict(features) 
predictions = AE.predict(X)
print(np.sum(score - predictions))


AE_decoder_weights = AE_saved_weights[-len(decoder_loaded_weights):]
for w1, w2 in zip(AE_decoder_weights, decoder_loaded_weights):
    print(np.sum(w1 - w2))



Traceback (most recent call last):
  File "Autoenc_pazzo.py", line 251, in <module>
    decoder_loaded_weights = decoder.get_weights()
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/engine/training.py", line 153, in get_weights
    return super(Model, self).get_weights()
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1130, in get_weights
    return backend.batch_get_value(params)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/backend.py", line 3010, in batch_get_value
    return get_session(tensors).run(tensors)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 950, in run
    run_metadata_ptr)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1173, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1350, in _do_run
    run_metadata)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1370, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.FailedPreconditionError: Error while reading resource variable decod_conv1d_1_2/bias from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/decod_conv1d_1_2/bias/N10tensorflow3VarE does not exist.
     [[node decod_conv1d_1_2/bias/Read/ReadVariableOp (defined at Autoenc_pazzo.py:168) ]]

Original stack trace for u'decod_conv1d_1_2/bias/Read/ReadVariableOp':
  File "Autoenc_pazzo.py", line 249, in <module>
    decoder = Decoder()
  File "Autoenc_pazzo.py", line 168, in Decoder
    d1 = Conv1D(32, 3, activation='tanh', padding='valid', name='decod_conv1d_1')(encoded_reshaped)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 591, in __call__
    self._maybe_build(inputs)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1881, in _maybe_build
    self.build(input_shapes)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/layers/convolutional.py", line 174, in build
    dtype=self.dtype)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/engine/base_layer.py", line 384, in add_weight
    aggregation=aggregation)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/training/tracking/base.py", line 663, in _add_variable_with_custom_getter
    **kwargs_for_getter)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/keras/engine/base_layer_utils.py", line 155, in make_variable
    shape=variable_shape if variable_shape.rank else None)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 259, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 220, in _variable_v1_call
    shape=shape)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 198, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 2495, in default_variable_creator
    shape=shape)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 263, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 460, in __init__
    shape=shape)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 649, in _init_from_args
    value = self._read_variable_op()
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/resource_variable_ops.py", line 935, in _read_variable_op
    self._dtype)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/ops/gen_resource_variable_ops.py", line 587, in read_variable_op
    "ReadVariableOp", resource=resource, dtype=dtype, name=name)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3616, in create_op
    op_def=op_def)
  File "/home/alessio/anaconda3/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

Recommended answer

Main problem with your code: no random seed is set. I refer to this answer for the full explanation, and focus on your particular instance here.

Explanation:

  • The order of model instantiation matters, as it changes the initialized weights
  • In practice, if you train multiple models, you should reset the seeds both at model instantiation and before training
  • Your AE, Encoder, and Decoder definitions make redundant use of Input and complicate introspection (e.g. .summary()); Encoder() and Decoder() already take care of it
  • To check whether the loaded decoder weights match the saved, trained AE decoder weights, see the example below
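The first point can be illustrated without Keras. The sketch below is a toy analogy, not the library's actual initializer: `init_dense` is a hypothetical stand-in that draws a weight matrix from a seeded NumPy generator, and the layer sizes are arbitrary. It shows that the same seed reproduces the same weights only when the models are created in the same order.

```python
import numpy as np

def init_dense(rng, n_in, n_out):
    """Hypothetical stand-in for a layer initializer: draw weights from rng."""
    return rng.standard_normal((n_in, n_out))

# Same seed, same instantiation order -> identical initial weights.
rng_a = np.random.default_rng(0)
enc_a, dec_a = init_dense(rng_a, 4, 2), init_dense(rng_a, 2, 4)

rng_b = np.random.default_rng(0)
enc_b, dec_b = init_dense(rng_b, 4, 2), init_dense(rng_b, 2, 4)

# Same seed, but the decoder is created first -> the encoder consumes
# different random draws and ends up with different weights.
rng_c = np.random.default_rng(0)
dec_c, enc_c = init_dense(rng_c, 2, 4), init_dense(rng_c, 4, 2)

same_order_matches = np.array_equal(enc_a, enc_b) and np.array_equal(dec_a, dec_b)
swapped_order_matches = np.array_equal(enc_a, enc_c)
print(same_order_matches, swapped_order_matches)
```

The same reasoning is why the seeds are reset immediately before each group of instantiations, and why Encoder() is always created before Decoder().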

Solution:

reset_seeds()
X = np.random.randn(10, 501, 1)  # '10' arbitrary
x = Input(batch_shape=(None, 501, 1))  # input tensor used to build the autoencoder below

reset_seeds()
encoder = Encoder()
decoder = Decoder()
autoencoder = Model(x, decoder(encoder(x)))
autoencoder.compile(optimizer='adam', loss='mse')

reset_seeds()
encoder = Encoder()
decoder = Decoder()

predictions = autoencoder.predict(X)

features = encoder.predict(X)
score = decoder.predict(features)

print(np.sum(score - predictions))
# 0.0  <-- 100% agreement
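The check above compares freshly initialized models. When the goal is to reuse the decoder after training, a complementary approach is to keep references to the very encoder and decoder instances that were used to build the autoencoder, since in Keras the weights belong to the layer instances. The following is a minimal sketch under assumptions: the small Dense layers, the names `build_encoder` and `build_decoder`, and the random data are illustrative stand-ins, not the architecture from the question.

```python
import numpy as np
from tensorflow.keras import layers, Model

def build_encoder():
    # Illustrative encoder: 16 inputs -> 4 central features.
    inp = layers.Input(shape=(16,))
    out = layers.Dense(4, activation="tanh")(inp)
    return Model(inp, out, name="encoder")

def build_decoder():
    # Illustrative decoder: 4 central features -> 16 outputs.
    inp = layers.Input(shape=(4,))
    out = layers.Dense(16)(inp)
    return Model(inp, out, name="decoder")

# Keep references to the instances that go into the autoencoder.
encoder = build_encoder()
decoder = build_decoder()
x = layers.Input(shape=(16,))
autoencoder = Model(x, decoder(encoder(x)))
autoencoder.compile(optimizer="adam", loss="mse")

data = np.random.randn(64, 16).astype("float32")
autoencoder.fit(data, data, epochs=2, batch_size=16, verbose=0)

# encoder and decoder share their (now trained) weights with autoencoder.
predictions = autoencoder.predict(data, verbose=0)
score = decoder.predict(encoder.predict(data, verbose=0), verbose=0)
max_abs_diff = float(np.max(np.abs(score - predictions)))
print(max_abs_diff)
```

Calling the builder functions again after training would create new, untrained instances, which is the situation in the question's code.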


Save/load example + preferred AE definition; reference

Your AE definition limits introspection via e.g. .summary(); instead, define it as below.

X = np.random.randn(10, 501, 1)
reset_seeds()
encoder = Encoder()
AE = DecoderAE(encoder.input, encoder.output)
AE.compile(optimizer='adam', loss='mse')

reset_seeds()
encoder = Encoder()
decoder = Decoder()

# Test equality
features = encoder.predict(X)
score = decoder.predict(features) 
predictions = AE.predict(X)
print(np.sum(score - predictions))
# 0.0  <-- exact or close to

AE.save_weights('autoencoder_weights.h5')
AE_saved_weights = AE.get_weights()

decoder = Decoder()
load_weights(decoder, 'autoencoder_weights.h5')  # see "reference"
decoder_loaded_weights = decoder.get_weights()

AE_decoder_weights = AE_saved_weights[-len(decoder_loaded_weights):]
for w1, w2 in zip(AE_decoder_weights, decoder_loaded_weights):
    print(np.sum(w1 - w2))
# 0.0
# 0.0
# ...
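One caveat about the equality checks: `np.sum(w1 - w2)` can print 0.0 even when the arrays differ, because positive and negative differences cancel. A stricter element-wise comparison might look like the sketch below; `weights_match` and its tolerance are hypothetical choices made for illustration.

```python
import numpy as np

def weights_match(weights_a, weights_b, atol=1e-6):
    """Return True if both lists hold arrays of equal shape and nearly equal values."""
    if len(weights_a) != len(weights_b):
        return False
    return all(
        a.shape == b.shape and np.allclose(a, b, atol=atol)
        for a, b in zip(weights_a, weights_b)
    )

# Differences of -1 and +1 cancel in a plain sum but not element-wise.
w_ref = [np.array([1.0, 2.0])]
w_off = [np.array([2.0, 1.0])]
sum_of_diff = float(np.sum(w_ref[0] - w_off[0]))
print(sum_of_diff, weights_match(w_ref, w_off), weights_match(w_ref, w_ref))
```

With the lists from the save/load example, the call would be `weights_match(AE_decoder_weights, decoder_loaded_weights)`.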


Functions used:

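import random

import numpy as np
import tensorflow as tf

def reset_seeds():
    # Reseed NumPy, Python's random module, and TensorFlow so that
    # models instantiated afterwards start from the same initial weights.
    np.random.seed(1)
    random.seed(2)
    if tf.__version__[0] == '2':
        tf.random.set_seed(3)       # TF 2.x API
    else:
        tf.set_random_seed(3)       # TF 1.x API
    print("RANDOM SEEDS RESET")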

def DecoderAE(encoder_input, encoded_input):
    encoded_reshaped = Reshape((32,1))(encoded_input)
    d1 = Conv1D(32, 3, activation='tanh', padding='valid',
                       name='decod_conv1d_1')(encoded_reshaped)
    d2 = UpSampling1D(2, name='decod_upsampling1d_1')(d1)
    d3 = Conv1D(256, 3, activation='tanh', padding='valid', name='decod_conv1d_2')(d2)
    d4 = UpSampling1D(2, name='decod_upsampling1d_2')(d3)
    d5 = Flatten(name='decod_flatten')(d4)
    d6 = Dense(501, name='decod_dense1')(d5)
    decoded = Reshape((501,1), name='decod_reshape')(d6)
    return Model(encoder_input, decoded)
