How to initialise weights of an MLP using an autoencoder #2nd part - Deep autoencoder #3rd part - Stacked autoencoder

Question

I have built an autoencoder (1 encoder 8:5, 1 decoder 5:8) which takes the Pima-Indian-Diabetes dataset (https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv) and reduces its dimension (from 8 to 5). I would now like to use these reduced features to classify the data using an MLP. Here, I have some problems with my basic understanding of the architecture. How do I use the weights of the autoencoder and feed them into the MLP? I have checked these threads: https://github.com/keras-team/keras/issues/91 and https://www.codementor.io/nitinsurya/how-to-re-initialize-keras-model-weights-et41zre2g. The question here is: which weight matrix should I consider, the one for the encoder part or the decoder part? When I add the layers for the MLP, how do I initialise the weights with these saved weights? I am not getting the exact syntax. Also, should my MLP start with 5 neurons, since my reduced dimension is 5? What are the possible dimensions of the MLP for this binary classification problem? Could anyone please elaborate?

The deep autoencoder code is as follows:

# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy

# Data pre-processing...

# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
                                X, Y, test_size=0.2, random_state=42)

# scale the data within [0-1] range; fit the scaler on the training
# set only, so test statistics don't leak into the transform
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# Autoencoder code begins here...
encoding_dim1 = 5    # size of encoded representations
encoding_dim2 = 3    # size of encoded representations in the bottleneck layer

# this is our input placeholder
input_data = Input(shape=(8,))
# "encoded" is the first encoded representation of the input
encoded = Dense(encoding_dim1, activation='relu', name='encoder1')(input_data)
# "enc" is the second encoded representation of the input
enc = Dense(encoding_dim2, activation='relu', name='encoder2')(encoded)
# "dec" is the lossy reconstruction of the input
dec = Dense(encoding_dim1, activation='sigmoid', name='decoder1')(enc)
# "decoded" is the final lossy reconstruction of the input
decoded = Dense(8, activation='sigmoid', name='decoder2')(dec)
# this model maps an input to its reconstruction
autoencoder = Model(inputs=input_data, outputs=decoded)

autoencoder.compile(optimizer='sgd', loss='mse')

# training
autoencoder.fit(x_train, x_train,
            epochs=300,
            batch_size=10,
            shuffle=True,
            validation_data=(x_test, x_test))  # need more tuning

# test the autoencoder by encoding and decoding the test dataset
reconstructions = autoencoder.predict(x_test)
print('Original test data')
print(x_test)
print('Reconstructed test data')
print(reconstructions)
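For clarity, here is a minimal sketch of what I am trying to achieve, assuming the 5-dimensional encoder1 output is what should feed the classifier; the MLP sizes below are just placeholders, not a recommendation:

# encoder-only model reusing the trained encoder1 layer
feature_extractor = Model(inputs=input_data,
                          outputs=autoencoder.get_layer('encoder1').output)

# 5-dimensional features for the classifier
feat_train = feature_extractor.predict(x_train)
feat_test = feature_extractor.predict(x_test)

# a small MLP on the reduced features; starting with 5 neurons
# simply matches the encoded dimension, it is not mandatory
mlp_input = Input(shape=(5,))
hidden = Dense(5, activation='relu')(mlp_input)
output = Dense(1, activation='sigmoid')(hidden)
mlp = Model(inputs=mlp_input, outputs=output)
mlp.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])
mlp.fit(feat_train, y_train, epochs=100, batch_size=10,
        validation_data=(feat_test, y_test))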

The stacked autoencoder code is as follows:

# from keras.models import Sequential
from keras.layers import Input, Dense
from keras.models import Model
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
import numpy

# Data pre-processing...

# load pima indians dataset
dataset = numpy.loadtxt("C:/Users/dibsa/Python Codes/pima.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:8]
Y = dataset[:, 8]

# Split data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(
                                X, Y, test_size=0.2, random_state=42)

# scale the data within [0-1] range; fit the scaler on the training
# set only, so test statistics don't leak into the transform
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

# Autoencoder code goes here...
encoding_dim1 = 5    # size of encoded representations
encoding_dim2 = 3    # size of encoded representations in the bottleneck layer

# this is our input placeholder
input_data1 = Input(shape=(8,))
# the first encoded representation of the input
encoded1 = Dense(encoding_dim1, activation='relu',
             name='encoder1')(input_data1)
# the first lossy reconstruction of the input
decoded1 = Dense(8, activation='sigmoid', name='decoder1')(encoded1)
# this model maps an input to its first layer of reconstructions
autoencoder1 = Model(inputs=input_data1, outputs=decoded1)
# this is the first encoder model
enc1 = Model(inputs=input_data1, outputs=encoded1)

autoencoder1.compile(optimizer='sgd', loss='mse')

# training
autoencoder1.fit(x_train, x_train, epochs=300,
             batch_size=10, shuffle=True,
             validation_data=(x_test, x_test))
# train the second autoencoder on the 5-dimensional encoded output of
# the first encoder (autoencoder1.predict would give the 8-dimensional
# reconstruction, which doesn't match autoencoder2's input shape)
FirstAEoutput = enc1.predict(x_train)

input_data2 = Input(shape=(encoding_dim1,))
# the second encoded representations of the input
encoded2 = Dense(encoding_dim2, activation='relu',
             name='encoder2')(input_data2)
# the final lossy reconstruction of the input
decoded2 = Dense(encoding_dim1, activation='sigmoid',
             name='decoder2')(encoded2)

# this model maps an input to its second layer of reconstructions
autoencoder2 = Model(inputs=input_data2, outputs=decoded2)

# this is the second encoder
enc2 = Model(inputs=input_data2, outputs=encoded2)

autoencoder2.compile(optimizer='sgd', loss='mse')

# training
autoencoder2.fit(FirstAEoutput, FirstAEoutput, epochs=300,
             batch_size=10, shuffle=True)

# the overall stacked encoder maps an input to its bottleneck encoding;
# encoded2 lives in a separate graph, so the second encoder is applied
# to encoded1 explicitly instead of wiring encoded2 in directly
stacked_encoder = Model(inputs=input_data1, outputs=enc2(encoded1))
# test the stacked encoder by encoding the test dataset

encodings = stacked_encoder.predict(x_test)
print('Original test data')
print(x_test)
print('Encoded test data')
print(encodings)
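A follow-up step I am assuming is needed (not sure if it is required) is to fine-tune the greedily pretrained stack end-to-end; reassembling the trained layers into the full 8-5-3-5-8 reconstruction path would look something like this:

# reassemble the full stack from the already-trained layers
# (layers are callable, and reusing them shares their weights)
h = autoencoder2.get_layer('encoder2')(encoded1)   # 5 -> 3
h = autoencoder2.get_layer('decoder2')(h)          # 3 -> 5
h = autoencoder1.get_layer('decoder1')(h)          # 5 -> 8
fine_tuned = Model(inputs=input_data1, outputs=h)
fine_tuned.compile(optimizer='sgd', loss='mse')
fine_tuned.fit(x_train, x_train, epochs=50, batch_size=10, shuffle=True,
               validation_data=(x_test, x_test))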

Solution

So many questions. What have you tried so far? Code snippets?

If your decoder is trying to reconstruct the input, then it doesn't really make sense to me to attach your classifier to its output. I mean, why not just attach it to the input in the first place? So if you are set on using an auto-encoder, I'd say it's pretty clear that you should attach your classifier to the output of the encoder pipe.

I'm not quite sure what you mean by "use the weights of the autoencoder and feed them into the mlp". You don't feed a layer with another layer's weights, but with its output signal. This is pretty easy to do in Keras. Let's say you defined your auto-encoder and trained it like so:

from keras.layers import Input, Dense
from keras.models import Model, save_model

x = Input(shape=(8,))
y = Dense(5, activation='sigmoid', name='encoder')(x)
y = Dense(8, name='decoder')(y)

ae = Model(inputs=x, outputs=y)
ae.compile(loss='mse', ...)
ae.fit(x_train, x_train, ...)

save_model(ae, './autoencoder.h5')

Then you can attach a classifying layer at the encoder and create a classifier model with the following code:

# load the model from the disk if you
# are in a different execution.
from keras.models import load_model
ae = load_model('./autoencoder.h5')

y = ae.get_layer('encoder').output
y = Dense(1, activation='sigmoid', name='predictions')(y)

classifier = Model(inputs=ae.inputs, outputs=y)
classifier.compile(loss='binary_crossentropy', ...)
classifier.fit(x_train, y_train, ...)

That's it, really. The classifier model will now have the encoder layer of the ae model (its first embedding layer) as its own first layer, followed by a sigmoid decision layer named predictions.
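Optionally (just a suggestion, not something you need), you can keep the pretrained encoder weights fixed while the new decision layer learns, by marking the encoder layer as non-trainable before compiling; the optimizer choice below is arbitrary:

# freeze the pretrained encoder so only the new
# predictions layer is updated during training
classifier.get_layer('encoder').trainable = False
classifier.compile(loss='binary_crossentropy', optimizer='adam')
classifier.fit(x_train, y_train, epochs=100, batch_size=10)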


If what you are really trying to do is to use the weights learned by the auto-encoder to initialize the classifier's weights (I'm not positive I'd recommend this approach):

You can take the weight matrices with layer#get_weights, prune them (because the encoder has 5 units and the classifier only has 1) and finally set the classifier weights. Something along the following lines:

w, b = ae.get_layer('encoder').get_weights()

# keep a single one of the encoder's five units;
# w has shape (8, 5), so the slice below is (8, 1)
neuron_to_keep = 2
w = w[:, neuron_to_keep:neuron_to_keep + 1]
b = b[neuron_to_keep:neuron_to_keep + 1]

# set_weights expects a list of arrays
classifier.get_layer('predictions').set_weights([w, b])
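One caveat: the slice above has shape (8, 1), which only fits a predictions layer that sees the raw 8-dimensional input (i.e. a standalone one-layer classifier). If predictions sits on top of the encoder as in the model above, it expects a (5, 1) kernel instead. A quick shape check before assigning saves you a cryptic error:

# compare the sliced weights against what the target layer holds
target = classifier.get_layer('predictions')
current_w, current_b = target.get_weights()
assert w.shape == current_w.shape, (w.shape, current_w.shape)
assert b.shape == current_b.shape, (b.shape, current_b.shape)
target.set_weights([w, b])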
