Derivative of neural network with respect to input

Problem Description

I trained a neural network to do a regression on the sine function and would like to compute the first and second derivative with respect to the input. I tried using the tf.gradients() function like this (neural_net is an instance of tf.keras.Sequential):

import tensorflow as tf

prediction = neural_net(x_value)
dx_f = tf.gradients(prediction, x_value)       # first derivative w.r.t. the input
dx_dx_f = tf.gradients(dx_f, x_value)          # second derivative w.r.t. the input

x_value is an array whose length equals the test-set size. However, this results in the following predictions and derivatives. The prediction of the network (blue curve) matches the sine function almost exactly, but I had to divide the first derivative (orange) by a factor of 10 and the second derivative (green) by a factor of 100 for them to be on the same order of magnitude. After that rescaling the first derivative looks OK, but the second derivative is completely erratic. Since the prediction of the sine function works really well, there is clearly something funny going on here.
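For reference, the same first and second derivatives can also be computed in TensorFlow 2.x eager mode with tf.GradientTape (tf.gradients itself only works in graph mode). The following is a minimal sketch, not the original poster's code; the two-layer model is just a placeholder for the sine-regression network described above:

import numpy as np
import tensorflow as tf

# Placeholder model standing in for the trained sine-regression network.
neural_net = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='tanh'),
    tf.keras.layers.Dense(1),
])

# Test inputs with shape (N, 1), one column per feature.
x_value = tf.convert_to_tensor(
    np.linspace(0.0, 2.0 * np.pi, 200).reshape(-1, 1), dtype=tf.float32)

with tf.GradientTape() as outer_tape:
    outer_tape.watch(x_value)
    with tf.GradientTape() as inner_tape:
        inner_tape.watch(x_value)
        prediction = neural_net(x_value)
    dx_f = inner_tape.gradient(prediction, x_value)   # first derivative w.r.t. the input
dx_dx_f = outer_tape.gradient(dx_f, x_value)          # second derivative w.r.t. the input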

Solution

One possible explanation for what you observed could be that your function is not twice differentiable. It looks as if there are jumps in the first derivative around the extrema. If so, the second derivative of the function doesn't really exist there, and the plot you get depends heavily on how the library handles such points.

Consider the following picture of a non-smooth function that jumps from 0.5 to -0.5 at every x in {1, 2, ...}. Its slope is 1 everywhere except where x is an integer. If you tried to plot its derivative, you would probably see a straight line at y = 1, which can easily be misinterpreted: someone who only looks at that plot could think the function is completely linear, running from minus infinity to plus infinity.
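Since the picture itself is not reproduced here, a small numerical sketch (my own construction, not from the original answer) of one such function: f(x) = x - floor(x) - 0.5 jumps from 0.5 to -0.5 at every integer and has slope 1 in between, yet a naive finite-difference derivative looks like a flat line at y = 1 almost everywhere:

import numpy as np

x = np.linspace(0.0, 5.0, 1001)
f = x - np.floor(x) - 0.5       # jumps from 0.5 to -0.5 at every integer, slope 1 elsewhere

df = np.gradient(f, x)          # finite-difference "derivative"
print(np.median(df))            # ~1.0: the plot would look like a straight line at y = 1
print(df.min())                 # large negative spike at each integer jump point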

If your results are produced by a neural net which uses ReLU, you can try the same thing with the sigmoid activation function. I suppose you won't see that many spikes with that function.
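As a minimal sketch of that suggestion (the layer sizes are placeholders, not from the original post), the network can simply be rebuilt with a smooth activation so that its second derivative exists everywhere:

import tensorflow as tf

# Same idea as the sine-regression network, but with a smooth activation
# ('sigmoid' here; 'tanh' would also work) instead of 'relu'.
smooth_net = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='sigmoid'),
    tf.keras.layers.Dense(32, activation='sigmoid'),
    tf.keras.layers.Dense(1),
])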
