keras/tensorflow model: gradient w.r.t. input returns the same (wrong?) value for all input data


Problem description

Given a trained Keras model, I am trying to compute the gradient of the output with respect to the input.

This example tries to fit the function y=x^2 with a Keras model composed of 4 layers with relu activations, and to compute the gradient of the model output with respect to the input.

from keras.models import Sequential
from keras.layers import Dense
from keras import backend as k
from sklearn.model_selection import train_test_split
import numpy as np
import tensorflow as tf

# random data
x = np.random.random((1000, 1))
y = x**2

# split train/val
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.15)

# model
model = Sequential()
# 1d input
model.add(Dense(10, input_shape=(1, ), activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(10, activation='relu'))
# 1d output
model.add(Dense(1))

## compile and fit
model.compile(loss='mse', optimizer='rmsprop', metrics=['mae'])
model.fit(x_train, y_train, batch_size=256, epochs=100, validation_data=(x_val, y_val), shuffle=True)

## compute derivative (gradient)
session = tf.Session()
session.run(tf.global_variables_initializer())
y_val_d_evaluated = session.run(tf.gradients(model.output, model.input), feed_dict={model.input: x_val})

print(y_val_d_evaluated)

x_val is a vector of 150 random numbers between 0 and 1.

My expectation is that y_val_d_evaluated (the gradient) should be:

A. an array of 150 different numbers (because x_val contains 150 different numbers);

B. close to 2*x_val (the derivative of x^2).
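As a reference for expectation B, the derivative of x^2 can be checked numerically without any model. The following is a minimal sketch in plain Python; the helper names `central_difference` and `square` and the sample points in `xs` are illustrative assumptions, not part of the original code.

```python
def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def square(x):
    return x ** 2

# Hypothetical sample points in (0, 1), standing in for entries of x_val.
xs = [0.1, 0.25, 0.5, 0.9]

numeric = [central_difference(square, x) for x in xs]
expected = [2.0 * x for x in xs]  # analytic derivative of x^2

for x, n, e in zip(xs, numeric, expected):
    print(x, n, e)
```

A correctly evaluated gradient of a well-fitted model should roughly follow the `expected` column, varying with x rather than staying constant.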

Instead, every time I run this example, y_val_d_evaluated contains 150 equal values (e.g. [0.0150494], [-0.0150494], [0.0150494], [-0.0150494], ...). Moreover, the value is very different from 2x, and it changes every time I run the example.

Does anyone have suggestions to help me understand why this code does not give the expected gradient results?

Recommended answer

OK, I found the problem. The following lines:

session = tf.Session()
session.run(tf.global_variables_initializer())

create a new tf session in which the model variables are re-initialized, so the gradient was evaluated on a model with random initial parameters rather than the trained ones. This explains why the value was different on every run.
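The effect of evaluating the gradient on freshly initialized parameters can be illustrated without TensorFlow. The sketch below is a simplification, not the original model: it assumes a single hidden ReLU layer, uniform random weights in [-1, 1], and zero biases (zeros being the default bias initializer of Keras `Dense`). The names `init_params` and `input_gradient` are hypothetical.

```python
import random

def init_params(seed, width=10):
    """Random weights and zero biases for a one-hidden-layer ReLU net."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(width)]  # input -> hidden
    b1 = [0.0] * width  # assumption: zero biases, as at initialization
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(width)]  # hidden -> output
    return w1, b1, w2

def input_gradient(params, x):
    """Gradient w.r.t. x of f(x) = sum_i w2[i] * relu(w1[i] * x + b1[i])."""
    w1, b1, w2 = params
    # Only hidden units whose pre-activation is positive contribute.
    return sum(w2_i * w1_i
               for w1_i, b1_i, w2_i in zip(w1, b1, w2)
               if w1_i * x + b1_i > 0.0)

xs = [0.1, 0.4, 0.9]  # hypothetical inputs in (0, 1), like x_val
grads_run_a = [input_gradient(init_params(seed=0), x) for x in xs]
grads_run_b = [input_gradient(init_params(seed=1), x) for x in xs]
print(grads_run_a)
print(grads_run_b)
```

In this sketch each "run" (seed) yields a different gradient, and within a run every positive input gets the same gradient, because with zero biases the set of active units does not depend on x > 0. This may be one reason the untrained model in the question showed so little variation across inputs.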

The solution is to get the TensorFlow session from the Keras backend instead:

session = k.get_session()

With this simple change, the results are as I expected.
