Tensorflow 2.0 model using tf.function very slow and is recompiling every time the train count changes. Eager runs about 4x faster

Problem description

I have models built from uncompiled keras code and am trying to run them through a custom training loop.

The TF 2.0 eager (by default) code runs about 30s on a CPU (laptop). When I create a keras model with wrapped tf.function call methods, it is running much, much slower and appears to take a very long time to start, particularly the "first" time.

For example, in the tf.function code the initial train on 10 samples takes 40s, and the follow up one on 10 samples takes 2s.

On 20 samples, the initial takes 50s and the follow up takes 4s.

The first train on 1 sample takes 2s and follow up takes 200 ms.

So it looks like each call of train is creating a new graph where the complexity scales with the train count!?

I am just doing something like this:

@tf.function
def train(n=10):
    step = 0
    loss = 0.0
    accuracy = 0.0
    for i in range(n):
        step += 1
        # model, opt, data and train_one_step are defined outside this snippet
        d, dd, l = train_one_step(model, opt, data)
        tf.print(dd)
        with tf.name_scope('train'):
            for k in dd:
                tf.summary.scalar(k, dd[k], step=step)
        if tf.equal(step % 10, 0):
            tf.print(dd)
    d.update(dd)
    return d

Where the model is keras.model.Model with a @tf.function-decorated call method, as per the examples.

Recommended answer

In short: by design, tf.function does not automatically box Python native types into tf.Tensor objects with a well-defined dtype.

If your function accepts tf.Tensor objects, the function is traced on the first call and a graph is built and associated with it. On every subsequent call, if the dtype and shape of the tf.Tensor arguments match, the graph is reused.

But when a Python native type is used, the graph is rebuilt every time the function is invoked with a different value.
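This retracing is easy to observe with tf.function's tracing counter (a minimal sketch; `double` is just an illustrative function, not from the question):

```python
import tensorflow as tf

@tf.function
def double(x):
    return x * 2

# Python ints: each distinct value triggers a new trace (a new graph)
double(1)
double(2)

# Tensors: one trace is reused for any value with matching dtype/shape
double(tf.constant(3))
double(tf.constant(4))

# 2 traces for the Python ints + 1 for the int32 scalar Tensor
print(double.experimental_get_tracing_count())
```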

In short: if you plan to use @tf.function, design your code to use tf.Tensor objects everywhere instead of Python variables.

tf.function is not a wrapper that magically accelerates a function that works well in eager mode; it is a wrapper that requires the eager function (body, input parameters, dtypes) to be designed with an understanding of what will happen once the graph is created, in order to get real speed-ups.
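Applied to the question's loop, a minimal sketch of the fix (the body is a stand-in for the real train_one_step work) is to pass the count as a Tensor and iterate with tf.range, so a single graph serves every count:

```python
import tensorflow as tf

@tf.function
def train(n):
    total = tf.constant(0)
    for _ in tf.range(n):  # tf.range keeps the loop inside the graph (a tf.while_loop)
        total += 1         # stand-in for the per-step training work
    return total

train(tf.constant(10))  # traced once
train(tf.constant(20))  # same dtype/shape -> graph reused, no retrace
```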
