tf.data vs keras.utils.Sequence performance

Question

I'm trying to decide whether to keep using the existing keras.utils.Sequence class or to switch to tf.data. From what I understand, tf.data optimizes performance by overlapping training on the GPU with preprocessing on the CPU. But how does that compare to keras.utils.Sequence and the Keras data generators? From what I have read, they seem to do the same thing. Is there anything to gain by switching to tf.data?

Recommended answer

Both approaches overlap input data preprocessing with model training. keras.utils.Sequence does this by running multiple Python processes (when multiprocessing is enabled), while tf.data does this by running multiple C++ threads.

If your preprocessing is done by a non-TensorFlow Python library such as PIL, keras.utils.Sequence may work better for you, since multiple processes are needed to avoid contention on Python's global interpreter lock.
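As a rough illustration of the Python-side approach, here is a minimal sketch of a Sequence subclass. The class name ArraySequence, the random input arrays, and the NumPy scaling step (standing in for PIL-style preprocessing) are all assumptions made for the example, not part of the original answer.

```python
import math

import numpy as np
import tensorflow as tf


class ArraySequence(tf.keras.utils.Sequence):
    """Hypothetical Sequence that preprocesses each batch in plain Python/NumPy."""

    def __init__(self, images, labels, batch_size):
        super().__init__()
        self.images = images
        self.labels = labels
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.images) / self.batch_size)

    def __getitem__(self, idx):
        lo = idx * self.batch_size
        hi = lo + self.batch_size
        # Stand-in for non-TensorFlow preprocessing (e.g. PIL): this code
        # runs in the Python interpreter, so it is subject to the GIL.
        batch = self.images[lo:hi].astype("float32") / 255.0
        return batch, self.labels[lo:hi]


# Made-up data: 10 small RGB "images" with integer labels.
images = np.random.randint(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
labels = np.arange(10)
seq = ArraySequence(images, labels, batch_size=4)
```

How the worker processes are switched on depends on the Keras version: older tf.keras releases take workers and use_multiprocessing as arguments to model.fit, while Keras 3 takes them in the Sequence constructor, so check the documentation for the version you have installed.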

If you can express your preprocessing using TensorFlow operations, I would expect tf.data to give better performance.
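For comparison, a minimal sketch of the tf.data route, assuming the preprocessing can be written with TensorFlow ops. The preprocess function, the random input arrays, and the batch size are assumptions made for the example; only the tf.data calls themselves come from the library.

```python
import numpy as np
import tensorflow as tf


def preprocess(image, label):
    # Hypothetical preprocessing expressed purely in TensorFlow ops, so
    # tf.data can run it in its own threads without holding the Python GIL.
    image = tf.cast(image, tf.float32) / 255.0
    return image, label


# Made-up data: 10 small RGB "images" with integer labels.
images = np.random.randint(0, 256, size=(10, 8, 8, 3), dtype=np.uint8)
labels = np.arange(10, dtype=np.int64)

dataset = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    # Let tf.data choose how many elements to preprocess in parallel.
    .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(4)
    # Prepare upcoming batches while the current one is being consumed.
    .prefetch(tf.data.AUTOTUNE)
)
```

A dataset built this way can be passed directly to model.fit in tf.keras. The prefetch step is what overlaps CPU preprocessing with training on the accelerator.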

Some other things to consider:

  • tf.data is the recommended approach for building scalable input pipelines for tf.keras.
  • tf.data is used more widely than keras.utils.Sequence, so it may be easier to find help with getting good performance.
