Tensorflow model prediction is slow
Question
I have a TensorFlow model with a single Dense layer:
model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
model.build(input_shape=(None, None, 25))
I construct a single input vector in float32:
np_vec = np.array(np.random.randn(1, 1, 25), dtype=np.float32)
vec = tf.cast(tf.convert_to_tensor(np_vec), dtype=tf.float32)
I want to feed that to my model for prediction, but it is very slow. If I call predict or __call__ it takes a really long time, compared to doing the same operation in NumPy.
Calling %timeit model.predict(vec):
10 loops, best of 3: 21.9 ms per loop
Calling the model directly with %timeit model(vec, training=False):
1000 loops, best of 3: 806 µs per loop
weights = np.array(model.layers[0].get_weights()[0])
%timeit np_vec @ weights
1000000 loops, best of 3: 1.27 µs per loop
100000 loops, best of 3: 2.57 µs per loop
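For reference, a Dense layer's forward pass is just a matrix multiply plus a bias. A minimal NumPy sketch of that computation, using random stand-in values for the kernel and bias rather than the ones pulled from the model above:

```python
import numpy as np

# Stand-in values; in the question these come from
# model.layers[0].get_weights() -> [kernel, bias].
W = np.random.randn(25, 2).astype(np.float32)   # kernel, shape (25, 2)
b = np.zeros(2, dtype=np.float32)               # bias, zero-initialized by default

x = np.random.randn(1, 1, 25).astype(np.float32)

# Equivalent of Dense(2) with the default linear activation: y = x @ W + b
y = x @ W + b
print(y.shape)  # (1, 1, 2)
```

Since a freshly built Dense layer has a zero bias, this matches what model(vec) computes, which is why the NumPy matmul is a fair comparison.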
Google Colab: https://colab.research.google.com/drive/1RCnTM24RUI4VkykVtdRtRdUVEkAHdu4A?usp=sharing
How can I make my TensorFlow model faster in inference time? Especially because I don't only have a Dense layer, but I also use an LSTM, and I don't want to reimplement that in NumPy.
Answer
The whole story lies behind the implementation of the LSTM layer in Keras. The Keras LSTM layer defaults to unroll=False, which makes the LSTM run a symbolic loop (the loop adds overhead at each step). Try passing an extra argument unroll=True to the LSTM:
tf.keras.layers.LSTM(64, return_sequences=True, stateful=True, unroll=True)
This may give up to a 2x speedup (tested on my machine using %timeit model(vec, training=False)). However, unroll=True can consume much more RAM for longer sequences, and it requires the sequence length to be fixed in advance. For more details, please have a look at the Keras LSTM documentation.
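To see that unrolling only changes the execution strategy, not the math, here is a rough sketch (assuming TensorFlow 2.x; the layer size and sequence length are illustrative, and stateful=True is omitted for simplicity) that copies one LSTM's weights into an unrolled twin and compares outputs:

```python
import numpy as np
import tensorflow as tf

seq_len, features = 10, 25
x = tf.constant(np.random.randn(1, seq_len, features), dtype=tf.float32)

# Symbolic-loop variant (the default) vs. unrolled variant sharing its weights.
looped = tf.keras.layers.LSTM(64, return_sequences=True, unroll=False)
unrolled = tf.keras.layers.LSTM(64, return_sequences=True, unroll=True)

_ = looped(x)                       # first call builds the weights
unrolled.build(x.shape)
unrolled.set_weights(looped.get_weights())

# Same computation, different graph: the outputs should agree.
print(np.allclose(looped(x).numpy(), unrolled(x).numpy(), atol=1e-5))
```

The unrolled variant trades the per-step loop machinery for a larger static graph, which is where both the speedup and the extra memory use come from.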