Custom Activation with custom gradient does not work


Problem description


I am trying to write code for training a simple neural network. The goal is to define a custom activation function and, instead of letting Keras derive its gradient automatically for backpropagation, make Keras use my custom gradient function for that activation:

import numpy as np
import tensorflow as tf
import math
import keras
from keras.models import Model, Sequential
from keras.layers import Input, Dense, Activation
from keras import regularizers
from keras import backend as K
from keras.backend import tf
from keras import initializers
from keras.layers import Lambda

@tf.custom_gradient
def custom_activation(x):

    def grad(dy):
        # Deliberately return a zero gradient so nothing propagates backwards
        return dy * 0

    result = K.sigmoid(x) * 2 - 1  # sigmoid rescaled to the range (-1, 1)
    return result, grad

x_train = np.array([[1, 2], [3, 4], [3, 4]])

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
layer = Lambda(lambda x: custom_activation)(output_1)
output_2 = Dense(2, activation='linear', kernel_initializer='glorot_normal')(layer)
model2 = Model(inputs=inputs, outputs=output_2)

model2.compile(optimizer='adam', loss='mean_squared_error')
model2.fit(x_train, x_train, epochs=20, validation_split=0.1, shuffle=False)

Since the gradient has been defined to be zero, I expect the loss not to change over the epochs. Instead, here is the traceback of the error I get:

Using TensorFlow backend.
WARNING:tensorflow:From C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
  File "C:/p/CE/mytest.py", line 43, in <module>
    layer = Lambda(lambda x: custom_activation)(output_1)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 474, in __call__
    output_shape = self.compute_output_shape(input_shape)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\core.py", line 656, in compute_output_shape
    return K.int_shape(x)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 593, in int_shape
    return tuple(x.get_shape().as_list())
AttributeError: 'function' object has no attribute 'get_shape'

Update: I used Manoj Mohan's answer and the code now works. Since the gradient is defined to be zero, I expect the loss to stay unchanged across epochs. But it does change. Why? Am I missing something?

Example:

Epoch 1/20
2019-10-03 10:31:34.193232: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2/2 [==============================] - 0s 68ms/step - loss: 8.3184 - val_loss: 13.7232
Epoch 2/20
2/2 [==============================] - 0s 496us/step - loss: 8.2783 - val_loss: 13.6368

Solution

Replace

layer = Lambda(lambda x: custom_activation)(output_1)

with

layer = Lambda(custom_activation)(output_1)

The lambda in the original line ignores x and returns the custom_activation function object itself instead of calling it, so the Lambda layer hands Keras a Python function where a tensor is expected. That is why K.int_shape fails with AttributeError: 'function' object has no attribute 'get_shape'.
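The difference is easy to see in isolation (a hypothetical standalone snippet, not from the original post):

def double(x):
    return 2 * x

wrong = (lambda x: double)(3)     # ignores its argument, returns the function object
right = (lambda x: double(x))(3)  # actually calls the function, same as double(3)

print(wrong)  # <function double at 0x...>
print(right)  # 6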

I expect to see unchanged loss among epochs since the gradient is defined to be zero. But, it does change. Why?

The gradient is forced to zero only at the intermediate Lambda layer, so gradients stop flowing backwards from that point on. But between the output and that intermediate layer the gradient still flows, so the weights of the final Dense layer still get updated, and the loss changes. The modified architecture below, which makes the custom activation the last layer, will output a constant loss across epochs (see the gradient check after the code):

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
output_2 = Dense(2, activation='linear', kernel_initializer='glorot_normal')(output_1)
layer = Lambda(custom_activation)(output_2)  # should be the last layer
model2 = Model(inputs=inputs, outputs=layer)
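A quick way to check this claim is to evaluate the symbolic gradients of the loss with respect to every trainable weight (a minimal sketch, assuming the Keras 2 / TF 1.x setup and the model2, x_train, K, and np names from the question; y_true is a hypothetical placeholder standing in for the training targets):

y_true = K.placeholder(shape=(None, 2))
loss = K.mean(K.square(model2.output - y_true))      # same MSE as in compile()
grads = K.gradients(loss, model2.trainable_weights)  # symbolic d(loss)/d(weight)
get_grads = K.function([model2.input, y_true], grads)

for w, g in zip(model2.trainable_weights, get_grads([x_train, x_train])):
    # With custom_activation as the last layer every maximum prints 0.0;
    # with it in the middle, the final Dense layer's gradients are nonzero.
    print(w.name, np.abs(g).max())

Zero gradients everywhere mean the optimizer has nothing to apply, so the weights, and therefore the loss, stay constant across epochs.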

