Tensorflow model prediction is slow


Problem description

I have a TensorFlow model with a single Dense layer:

model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
model.build(input_shape=(None, None, 25))

I construct a single input vector in float32:

np_vec = np.array(np.random.randn(1, 1, 25), dtype=np.float32)
vec = tf.cast(tf.convert_to_tensor(np_vec), dtype=tf.float32)

I want to feed that to my model for prediction, but it is very slow. If I call predict or __call__ it takes a really long time, compared to doing the same operation in NumPy.

  • Calling %timeit model.predict(vec):

    10 loops, best of 3: 21.9 ms per loop

  • Calling the model directly with %timeit model(vec, training=False):

    1000 loops, best of 3: 806 µs per loop

  • Performing the multiplication operation myself:

    weights = np.array(model.layers[0].get_weights()[0])   
    %timeit np_vec @ weights
    

    1000000 loops, best of 3: 1.27 µs per loop

  • Performing the multiplication myself using torch:

    100000 loops, best of 3: 2.57 µs per loop

  • Google Colab: https://colab.research.google.com/drive/1RCnTM24RUI4VkykVtdRtRdUVEkAHdu4A?usp=sharing
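As an aside on the manual benchmark above: np_vec @ weights only applies the kernel, while a full Dense forward pass is x @ W + b. A minimal NumPy sketch of that computation, using stand-in random weights rather than the actual model's:

```python
import numpy as np

# Stand-in input and parameters matching the shapes in the question:
# input (1, 1, 25), Dense kernel (25, 2), Dense bias (2,).
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 1, 25)).astype(np.float32)
W = rng.standard_normal((25, 2)).astype(np.float32)
b = np.zeros(2, dtype=np.float32)  # a freshly built Dense layer starts with zero bias

# Full Dense forward pass: matrix multiply by the kernel, then add the bias.
out = x @ W + b
print(out.shape)  # (1, 1, 2)
```

With a zero-initialized bias the result matches x @ W exactly, which is why the kernel-only benchmark is a fair comparison for a freshly built layer.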

How can I make my TensorFlow model's inference faster? Especially because I don't only have a Dense layer; I also use an LSTM, and I don't want to reimplement that in NumPy.

Recommended answer

The whole story lies in the implementation of the LSTM layer in Keras. The Keras LSTM layer has a default argument unroll=False, which makes the LSTM run a symbolic loop over the timesteps (the loop adds overhead). Try passing an extra argument to the LSTM: unroll=True.

    tf.keras.layers.LSTM(64, return_sequences=True, stateful=True, unroll=True)
    

This may give up to a 2x speedup (tested on my machine, using %timeit model(vec, training=False)). However, unroll=True may consume more RAM for longer sequences. For more details, have a look at the Keras LSTM documentation.
