How to create a custom layer in Keras with 'stateful' variables/tensors?
Question
I would like to ask for some help with creating a custom layer. What I am trying to do is actually quite simple: generate an output layer with 'stateful' variables, i.e. tensors whose value is updated at each batch.
To make everything clearer, here is a snippet of what I would like to do:
def call(self, inputs):
    c = self.constant
    m = self.extra_constant
    update = inputs*m + c
    X_new = self.X_old + update
    outputs = X_new
    self.X_old = X_new
    return outputs
The idea here is simple:
- X_old is initialized to 0 in def __init__(self, ...)
- update is computed as a function of the inputs to the layer
- the output of the layer is computed (i.e. X_new)
- the value of X_old is set equal to X_new so that, at the next batch, X_old is no longer equal to zero but equal to X_new from the previous batch.
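The four steps above can be sketched in plain Python, without Keras, just to pin down the behaviour being asked for. The class name `RunningState` and the scalar values are hypothetical and chosen only for illustration; the real layer would operate on tensors.

```python
class RunningState:
    """Hypothetical stand-in for the layer: keeps X_old between calls."""

    def __init__(self, constant, extra_constant):
        self.c = constant
        self.m = extra_constant
        self.x_old = 0.0  # X_old starts at zero

    def call(self, inputs):
        update = inputs * self.m + self.c  # update depends on the inputs
        x_new = self.x_old + update        # output of the layer
        self.x_old = x_new                 # remembered for the next batch
        return x_new


layer = RunningState(constant=1.0, extra_constant=2.0)
first = layer.call(3.0)   # 0 + (3*2 + 1) = 7
second = layer.call(3.0)  # 7 + (3*2 + 1) = 14
```

In plain Python the attribute assignment is enough because `call` really runs on every batch; the rest of the question is about getting the same effect inside a Keras graph.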
I have found out that K.update does the job, as shown in the example:
X_new = K.update(self.X_old, self.X_old + update)
The problem here is that, if I then try to define the outputs of the layer as:
outputs = X_new
return outputs
I receive the following error when I try model.fit():
ValueError: An operation has `None` for gradient. Please make sure that all of your ops have
gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
I keep getting this error even though I set layer.trainable = False and did not define any bias or weights for the layer. On the other hand, if I just do self.X_old = X_new, the value of X_old does not get updated.
Do you guys have a solution to implement this? I believe it should not be that hard, since stateful RNNs also work in a 'similar' way.
Thanks in advance for your help!
Recommended answer
Defining a custom layer can be confusing at times. Some of the methods you override are called only once, yet they give the impression that, as in many other OO libraries/frameworks, they will be called many times.
Here is what I mean: when you define a layer and use it in a model, the Python code you write in the overridden call method is not invoked directly during forward or backward passes. Instead, it is called only once, when the graph is built (in practice, around the time you construct the model and call model.compile). The Python code is turned into a computational graph, and that graph, through which the tensors flow, is what performs the computations during training and prediction.
That's why putting a print statement in your model to debug it won't work; you need to use tf.print to add a print command to the graph.
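A minimal sketch of that trace-once behaviour, assuming TensorFlow 2.x and using a standalone `tf.function` rather than a Keras model (the function name `step` and the list `trace_log` are hypothetical):

```python
import tensorflow as tf

trace_log = []  # filled by ordinary Python code, i.e. only while tracing


@tf.function
def step(x):
    trace_log.append("traced")   # Python side effect: runs when the graph is built
    tf.print("in graph, x =", x)  # graph op: runs on every call
    return x + 1.0


a = step(tf.constant(1.0))
b = step(tf.constant(2.0))  # same dtype and shape, so the traced graph is reused
```

After both calls `trace_log` holds a single entry, while the `tf.print` message is emitted twice. A plain `print` placed in the function body would behave like the `append`, which is why it is of little use for inspecting values per batch.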
It is the same situation with the state variable you want to have. Instead of simply assigning old + update to new, you need to call a Keras function that adds that operation to the graph.
Also note that tensors are immutable, so you need to define the state as a tf.Variable in the __init__ method.
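The difference can be seen in a few lines, assuming TensorFlow 2.x running eagerly (the variable names are illustrative only):

```python
import tensorflow as tf

t = tf.constant([1.0, 2.0])
# "Changing" a tensor really builds a new tensor; t itself is untouched.
t_plus = t + 0.5

v = tf.Variable([1.0, 2.0])
# A variable can be modified in place, which is what a persistent state needs.
v.assign_add([0.5, 0.5])
```

Here `t` still holds `[1.0, 2.0]` after the addition, whereas `v` now holds `[1.5, 2.5]`.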
So I believe this code is closer to what you're looking for:
import tensorflow as tf

class CustomLayer(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super(CustomLayer, self).__init__(**kwargs)
        # State carried across batches; shape (3,) matches the reduce_sum in call().
        self.state = tf.Variable(tf.zeros((3,), 'float32'))
        self.constant = tf.constant([[1,1,1],[1,0,-1],[-1,0,1]], 'float32')
        self.extra_constant = tf.constant([[1,1,1],[1,0,-1],[-1,0,1]], 'float32')
        self.trainable = False

    def call(self, X):
        m = self.constant
        c = self.extra_constant
        # c has shape (3, 3), so this broadcast assumes a batch of 3 (or 1) rows.
        outputs = self.state + tf.matmul(X, m) + c
        # Add the state update to the graph instead of rebinding a Python attribute.
        tf.keras.backend.update(self.state, tf.reduce_sum(outputs, axis=0))
        return outputs
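As a rough check of the idea, here is a simplified variant run eagerly under TensorFlow 2.x. It is a sketch under assumptions, not the answer's exact layer: the class name `AccumulatingLayer` is hypothetical, the constants are dropped to keep the arithmetic easy to follow, the state is created with `add_weight`, and the variable's own `assign` method is used in place of `K.update`.

```python
import tensorflow as tf


class AccumulatingLayer(tf.keras.layers.Layer):
    """Hypothetical simplified layer: output = state + input."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Non-trainable state, so the optimizer never asks for its gradient.
        self.state = self.add_weight(
            name="state", shape=(3,), initializer="zeros", trainable=False
        )

    def call(self, x):
        outputs = self.state + x
        # Store the batch sum so the next call starts from it.
        self.state.assign(tf.reduce_sum(outputs, axis=0))
        return outputs


layer = AccumulatingLayer()
batch = tf.ones((2, 3))
out1 = layer(batch)  # state was 0, so every entry is 1; state becomes 2
out2 = layer(batch)  # state was 2, so every entry is 3; state becomes 6
final_state = layer.state.numpy()
```

The second call sees the state left behind by the first, which is the behaviour the question asks for.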