How to run inference in fp16 with an fp32-trained model?


Problem description

I want to run inference with an fp32 model in fp16 to verify the half-precision results. After loading the checkpoint, the parameters can be converted to float16, but how do I then use these fp16 parameters in a session?

import tensorflow as tf  # TensorFlow 1.x

reader = tf.train.NewCheckpointReader(model_file)
var_to_map = reader.get_variable_to_dtype_map()

for key, val in var_to_map.items():
    tsr = reader.get_tensor(key)        # numpy array holding the fp32 weights
    val_f16 = tf.cast(tsr, tf.float16)  # cast to half precision

# sess.restore() ???

Recommended answer

I found a method to achieve it:

  1. Load the checkpoint with tf.train.NewCheckpointReader(), then read the parameters and convert them to float16.
  2. Use the converted float16 parameters to initialize the layers:

    # 'inits' maps variable names to the float16 arrays read from the checkpoint
    weight_name = scope_name + '/' + get_layer_str() + '/' + 'weight'
    initw = inits[weight_name]
    # dtype follows the initializer, so the variable is created as float16
    weight = tf.get_variable('weight', dtype=initw.dtype, initializer=initw)
    out = tf.nn.conv2d(self.get_output(), weight, strides=[1, stride, stride, 1], padding='SAME')

  3. Run the graph.
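The precision effect of the cast-then-run recipe above can be checked without TensorFlow at all. The following is a minimal NumPy sketch (the weight matrix and input are random stand-ins, not real checkpoint data): cast fp32 weights to float16, run the same computation in both precisions, and compare.

```python
import numpy as np

# Hypothetical fp32 "checkpoint" weights and input (random stand-ins).
rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal((3, 3)).astype(np.float32)
x_fp32 = rng.standard_normal((4, 3)).astype(np.float32)

# Steps 1-2 of the answer: cast the loaded params to float16.
w_fp16 = w_fp32.astype(np.float16)

# Step 3: run the "graph" (a single matmul here) in both precisions.
y_fp32 = x_fp32 @ w_fp32
y_fp16 = x_fp32.astype(np.float16) @ w_fp16

# The half-precision result should track the fp32 result closely.
max_err = np.max(np.abs(y_fp32 - y_fp16.astype(np.float32)))
print(max_err)
```

With values of this magnitude the maximum absolute deviation is on the order of float16's rounding error, which is the kind of gap one would expect when verifying half-precision inference against the fp32 baseline.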

My GPU is a GTX 1080 without tensor cores, yet inference with fp16 is 20%-30% faster than with fp32. I don't understand the reason. Which hardware units are used to compute fp16? The same units that normally handle fp32?
