Implementing the Rprop algorithm in Keras

Problem description

I am trying to implement the resilient backpropagation optimizer for Keras (link), but the challenging part is performing an update on each individual parameter based on whether its corresponding gradient is positive, negative, or zero. I wrote the code below as a start towards implementing the Rprop optimizer. However, I can't seem to find a way to access the parameters individually: looping over params (as in the code below) returns p, g, g_old, s, wChangeOld at each iteration, all of which are matrices.

Is there a way I could iterate over the individual parameters and update them? It would also work if I could index the parameter vector based on the sign of its gradients.
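
For reference, the core Rprop update is element-wise over each parameter tensor. Below is a minimal NumPy sketch of one step (the iRprop- variant, which zeroes the gradient after a sign flip instead of backtracking; all names are chosen here for illustration):

import numpy as np

def rprop_step(w, g, g_old, step,
               pos=1.2, neg=0.5, min_step=1e-6, max_step=50.0):
    # The sign of g * g_old tells whether each gradient entry kept (+),
    # flipped (-), or lost (0) its sign since the last step.
    sign_change = g * g_old
    # Grow the step size where the sign was kept, shrink it where it flipped.
    step = np.where(sign_change > 0, np.minimum(step * pos, max_step), step)
    step = np.where(sign_change < 0, np.maximum(step * neg, min_step), step)
    # iRprop-: forget the gradient after a sign flip (no weight backtracking),
    # so those entries are not updated this step.
    g = np.where(sign_change < 0, 0.0, g)
    w = w - np.sign(g) * step
    return w, g, step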

from keras import backend as K
from keras.optimizers import Optimizer

class Rprop(Optimizer):
    def __init__(self, init_step=0.01, **kwargs):
        super(Rprop, self).__init__(**kwargs)
        self.init_step = K.variable(init_step, name='init_step')
        self.iterations = K.variable(0., name='iterations')

        self.posStep = 1.2
        self.negStep = 0.5
        self.minStep = 1e-6
        self.maxStep = 50.

    def get_updates(self, params, constraints, loss):
        grads = self.get_gradients(loss, params)
        self.updates = [K.update_add(self.iterations, 1)]

        shapes = [K.get_variable_shape(p) for p in params]
        stepList = [K.ones(shape)*self.init_step  for shape in shapes]
        wChangeOldList = [K.zeros(shape) for shape in shapes]
        grads_old = [K.zeros(shape) for shape in shapes]

        self.weights = stepList + grads_old + wChangeOldList
        self.updates = []

        for p, g, g_old, s, wChangeOld in zip(params, grads, grads_old,
                                              stepList, wChangeOldList):
            change = K.sign(g * g_old)

            if change > 0:
                s_new = K.minimum(s * self.posStep, self.maxStep)
                wChange = s_new * K.sign(g)
                g_new = g

            elif change < 0:
                s_new = K.maximum(s * self.posStep, self.maxStep)
                wChange = - wChangeOld
                g_new = 0

            else:
                s_new = s
                wChange = s_new * K.sign(g)
                g_new = p

            self.updates.append(K.update(g_old, g_new))
            self.updates.append(K.update(wChangeOld, wChange))
            self.updates.append(K.update(s, s_new))

            new_p = p - wChange

            # Apply constraints
            if p in constraints:
                c = constraints[p]
                new_p = c(new_p)

            self.updates.append(K.update(p, new_p))
        return self.updates

    def get_config(self):
        config = {'init_step': float(K.get_value(self.init_step))}
        base_config = super(Rprop, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Answer

I was looking for an RProp algorithm in Keras as well and found this question. I took the liberty of adapting your code to my purposes and am posting it back here. So far it seems to work quite well, but I haven't tested it extensively.

Disclaimer: I'm very new to Keras but have a lot of experience with Theano (and Blocks). Furthermore, I tested this only with Theano as the backend, not with TensorFlow.

import numpy
from keras import backend as K
from keras.optimizers import Optimizer

class RProp(Optimizer):
    def __init__(self, init_alpha=1e-3, scale_up=1.2, scale_down=0.5,
                 min_alpha=1e-6, max_alpha=50., **kwargs):
        super(RProp, self).__init__(**kwargs)
        self.init_alpha = K.variable(init_alpha, name='init_alpha')
        self.scale_up = K.variable(scale_up, name='scale_up')
        self.scale_down = K.variable(scale_down, name='scale_down')
        self.min_alpha = K.variable(min_alpha, name='min_alpha')
        self.max_alpha = K.variable(max_alpha, name='max_alpha')

    def get_updates(self, params, constraints, loss):
        grads = self.get_gradients(loss, params)
        shapes = [K.get_variable_shape(p) for p in params]
        alphas = [K.variable(numpy.ones(shape) * self.init_alpha) for shape in shapes]
        old_grads = [K.zeros(shape) for shape in shapes]
        self.weights = alphas + old_grads
        self.updates = []

        for param, grad, old_grad, alpha in zip(params, grads, old_grads, alphas):
            new_alpha = K.switch(
                K.greater(grad * old_grad, 0),
                K.minimum(alpha * self.scale_up, self.max_alpha),
                K.maximum(alpha * self.scale_down, self.min_alpha)
            )
            new_param = param - K.sign(grad) * new_alpha
            # Apply constraints
            if param in constraints:
                c = constraints[param]
                new_param = c(new_param)
            self.updates.append(K.update(param, new_param))
            self.updates.append(K.update(alpha, new_alpha))
            self.updates.append(K.update(old_grad, grad))

        return self.updates

    def get_config(self):
        config = {
            'init_alpha': float(K.get_value(self.init_alpha)),
            'scale_up': float(K.get_value(self.scale_up)),
            'scale_down': float(K.get_value(self.scale_down)),
            'min_alpha': float(K.get_value(self.min_alpha)),
            'max_alpha': float(K.get_value(self.max_alpha)),
        }
        base_config = super(RProp, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
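
For completeness, a minimal usage sketch, assuming a Keras version whose Optimizer.get_updates signature matches the code above (the Keras 1.x style with a constraints argument); the architecture and data are made up for illustration:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=10))
model.add(Dense(1, activation='sigmoid'))

# Pass an instance of the RProp class defined above as the optimizer.
model.compile(optimizer=RProp(init_alpha=1e-3), loss='binary_crossentropy')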

Important notes:

  • RProp is usually not included in machine learning libraries for a reason: it does not work at all unless you use full-batch learning, and full-batch learning is only useful when you have a small training set (see the sketch after this list).
  • Adam (built into Keras) outperforms this RProp algorithm. Maybe that is just how it is, or maybe I made a mistake :)
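
To make the full-batch point concrete, here is a hedged sketch with placeholder names (model, X_train, y_train): setting batch_size to the size of the training set means every update sees the gradient over the whole dataset, which is what Rprop assumes.

# Full-batch learning: one batch covers all training samples, so each
# Rprop update uses the true (full) gradient rather than a noisy estimate.
model.fit(X_train, y_train, batch_size=len(X_train))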

A few comments about your code (referring to your original variable names):

  • wChange is never used across iterations, so you don't need to store it in permanent variables.
  • change > 0 does not do what you think it does, because change is a tensor variable. What you want here is an element-wise comparison; use K.switch() instead (see the sketch after this list).
  • You used maxStep twice where the second occurrence should be minStep.
  • The situation where change is zero is negligible, since it almost never happens in practice.
  • g_new = 0 and g_new = p are both completely bogus and should be g_new = g, as in the first if branch.
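
To illustrate the second point, a small sketch of the element-wise comparison using the Keras backend (the tensor values are made up):

from keras import backend as K

g = K.variable([0.5, -1.0, 0.0])
g_old = K.variable([1.0, 1.0, 1.0])

# A Python `if change > 0:` would try to collapse the whole tensor into a
# single bool, which the backend refuses. K.switch selects element-wise:
change = g * g_old
mask = K.switch(K.greater(change, 0), K.ones_like(g), K.zeros_like(g))
print(K.eval(mask))  # -> [1., 0., 0.]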
