What are _get_hyper and _set_hyper in TensorFlow optimizers?
Question
I see it in __init__ of e.g. the Adam optimizer: self._set_hyper('beta_1', beta_1). There are also _get_hyper and _serialize_hyperparameter throughout the code. I don't see these in Keras optimizers - are they optional? When should or shouldn't they be used when creating custom optimizers?
Answer
They enable setting and getting Python literals (int, str, etc.), callables, and tensors. The usage is for convenience and consistency: anything set via _set_hyper can be retrieved via _get_hyper, avoiding repeated boilerplate code. I've implemented Keras AdamW in all major TF & Keras versions, and will use it as a reference.
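A minimal sketch of the pattern (assuming TF 2.x; from TF 2.11 the OptimizerV2 base class with these methods lives under tf.keras.optimizers.legacy, so the snippet falls back accordingly - MyOptimizer and its hyperparameters are hypothetical names for illustration):

```python
import tensorflow as tf

# OptimizerV2 moved under `legacy` in TF >= 2.11; fall back for older TF 2.x.
opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)

class MyOptimizer(opt_mod.Optimizer):  # hypothetical custom optimizer
    def __init__(self, learning_rate=0.01, beta_1=0.9, name="MyOptimizer",
                 **kwargs):
        super().__init__(name, **kwargs)
        # _set_hyper accepts Python literals, callables, and tensors alike.
        self._set_hyper("learning_rate", learning_rate)
        self._set_hyper("beta_1", beta_1)

opt = MyOptimizer()
beta_1 = opt._get_hyper("beta_1")  # retrieves whatever _set_hyper stored
```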
- t_cur is a tf.Variable. Each time we "set" it, we must invoke K.set_value; if we do self.t_cur = 5, this will destroy the tf.Variable and wreck optimizer functionality. If instead we use model.optimizer._set_hyper('t_cur', 5), it'd be set appropriately - but this requires it to have been defined via _set_hyper previously.
- Both _get_hyper & _set_hyper enable programmatic treatment of attributes - e.g., we can make a for-loop over a list of attribute names to get or set, using just _get_hyper and _set_hyper, whereas otherwise we'd need to code conditionals and typechecks. Also, _get_hyper(name) requires that name was previously set via _set_hyper.
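Both points can be sketched against the built-in (legacy) Adam, whose beta_1/beta_2/decay are all registered via _set_hyper (assumes TF 2.x; the legacy fallback is for TF >= 2.11):

```python
import tensorflow as tf

opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)
opt = opt_mod.Adam(beta_1=0.9, beta_2=0.999)

# Programmatic access: one loop instead of per-attribute conditionals.
for name in ("beta_1", "beta_2", "decay"):
    print(name, float(opt._get_hyper(name)))

# Once the hyper exists as a tf.Variable, _set_hyper updates it in place
# (the K.set_value path) instead of clobbering the variable object.
opt._set_hyper("beta_1", 0.5)
```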
_get_hyper enables typecasting via dtype=. Ex: beta_1_t in default Adam is cast to the same numeric type as var (e.g. the layer weight), which is required for some ops. Again a convenience, as we could typecast manually (math_ops.cast).
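For instance (assuming TF 2.x with the legacy-namespace fallback; the tf.Variable here is just a stand-in for a layer weight):

```python
import tensorflow as tf

opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)
opt = opt_mod.Adam(beta_1=0.9)

# dtype= casts the hyper to match a variable's dtype, mirroring what Adam
# does internally to build beta_1_t.
var = tf.Variable([1.0, 2.0], dtype=tf.float16)  # stand-in for a layer weight
beta_1_t = opt._get_hyper("beta_1", var.dtype.base_dtype)
```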
_set_hyper enables the use of _serialize_hyperparameter, which retrieves the Python value (int, float, etc.) of callables, tensors, or already-Python values. The name stems from the need to convert tensors and callables to Pythonics for e.g. pickling or JSON-serializing - but it can also be used as a convenience for seeing tensor values in Graph execution.
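A short sketch (assuming TF 2.x, legacy-namespace fallback as above):

```python
import tensorflow as tf

opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)
opt = opt_mod.Adam(beta_1=0.9)

# Whether the hyper is currently a float, a callable, or a tf.Variable,
# _serialize_hyperparameter hands back a plain Python value - this is what
# get_config() relies on so the optimizer can be pickled / JSON-serialized.
beta_1 = opt._serialize_hyperparameter("beta_1")
cfg = opt.get_config()  # plain Python values, e.g. cfg["beta_1"]
```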
Lastly, everything instantiated via _set_hyper gets assigned to the optimizer._hyper dictionary, which is then iterated over in _create_hypers. The else branch in that loop casts all Python numerics to tensors - so _set_hyper will not create int, float, etc. attributes. Worth noting is the aggregation= kwarg, whose documentation reads: "Indicates how a distributed variable will be aggregated". This part is a bit more than "for convenience" (lots of code to replicate).
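The promotion from Python numeric to variable can be observed directly (a sketch assuming TF 2.x, where the first _get_hyper access triggers _create_hypers; legacy-namespace fallback as above):

```python
import tensorflow as tf

opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)
opt = opt_mod.Adam(beta_1=0.9)

before = opt._hyper["beta_1"]  # right after __init__: a plain Python float
opt._get_hyper("beta_1")       # first access triggers _create_hypers
after = opt._hyper["beta_1"]   # now a tf.Variable created via add_weight
```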
- _set_hyper has a limitation: it does not allow specifying a dtype. If the add_weight approach in _create_hypers is desired with a particular dtype, then add_weight should be called directly.
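For example, a typed counter like t_cur can be created via add_weight directly (a sketch assuming TF 2.x, legacy-namespace fallback as above):

```python
import tensorflow as tf

opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)
opt = opt_mod.Adam()

# _set_hyper has no dtype argument; for a typed hyper (e.g. an int64
# iteration counter), create the variable with add_weight instead.
t_cur = opt.add_weight("t_cur", shape=[], dtype=tf.int64, trainable=False)
```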
When to use vs. not use: use them if the attribute is used by the optimizer via TensorFlow ops - i.e. if it needs to be a tf.Variable. For example, epsilon is set regularly (as a plain Python attribute), as it never needs to be a tensor variable.
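The built-in (legacy) Adam illustrates the split (assuming TF 2.x, legacy-namespace fallback as above):

```python
import tensorflow as tf

opt_mod = getattr(tf.keras.optimizers, "legacy", tf.keras.optimizers)
opt = opt_mod.Adam()

# beta_1 participates in TF ops, so Adam routes it through _set_hyper;
# epsilon never needs to be a tensor variable, so it's a plain attribute.
print("beta_1" in opt._hyper)   # registered hyper
print("epsilon" in opt._hyper)  # not a hyper
print(type(opt.epsilon))        # plain Python float
```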