Get fully quantized TfLite model, also with input and output in int8


Problem description

I quantized a Keras h5 model (TF 1.13; keras_vggface model) with Tensorflow 1.15.3, to use it with an NPU. The code I used for conversion is:

converter = tf.lite.TFLiteConverter.from_keras_model_file(saved_model_dir + modelname)  
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8  # or tf.uint8
converter.inference_output_type = tf.int8  # or tf.uint8
tflite_quant_model = converter.convert()

The quantized model I get looks good at first sight. The input types of the layers are int8, filters are int8, biases are int32, and outputs are int8.

However, the model has a quantize layer after the input layer, and the input layer itself is float32 [see image below]. But it seems that the NPU also needs the input to be int8.
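
For reference, the input and output tensor types can be inspected directly with the TFLite interpreter. This is only a minimal sketch, assuming tflite_quant_model holds the bytes returned by converter.convert() above:

import tensorflow as tf

# Load the converted model from memory and look at its I/O tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_quant_model)
interpreter.allocate_tensors()

print(interpreter.get_input_details()[0]['dtype'])   # float32 in the case described above
print(interpreter.get_output_details()[0]['dtype'])  # expected: int8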

Is there a way to fully quantize without that conversion layer, but also with int8 as the input?

As you can see above, I already used:

 converter.inference_input_type = tf.int8
 converter.inference_output_type = tf.int8


EDIT

Solution from user dtlam

Even though the model still does not run with the Google NNAPI, the solution to quantize the model with input and output in int8, using either TF 1.15.3 or TF 2.2.0, is (thanks to delan):

...
import cv2
import numpy as np
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model_file(saved_model_dir + modelname)

def representative_dataset_gen():
  # Yield a few sample inputs so the converter can calibrate the quantization ranges.
  for _ in range(10):
    pfad = 'pathtoimage/000001.jpg'
    img = cv2.imread(pfad)
    img = np.expand_dims(img, 0).astype(np.float32)
    # Get sample input data as a numpy array in a method of your choosing.
    yield [img]

converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.experimental_new_converter = True

converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
quantized_tflite_model = converter.convert()

if tf.__version__.startswith('1.'):
    open("test153.tflite", "wb").write(quantized_tflite_model)
if tf.__version__.startswith('2.'):
    with open("test220.tflite", 'wb') as f:
        f.write(quantized_tflite_model)
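
To sanity-check the converted file and run an inference with it, the quantization parameters stored in the input and output tensors can be used to convert between float32 and int8. This is only a minimal sketch, assuming the file name from above and a preprocessed float32 image img of the right shape:

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="test220.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
print(inp['dtype'], out['dtype'])  # both should now be int8

# Quantize the float input using the input tensor's scale and zero point.
scale, zero_point = inp['quantization']
img_int8 = np.clip(np.round(img / scale + zero_point), -128, 127).astype(np.int8)

interpreter.set_tensor(inp['index'], img_int8)
interpreter.invoke()

# Dequantize the raw int8 output back to float for interpretation.
out_scale, out_zero_point = out['quantization']
result = (interpreter.get_tensor(out['index']).astype(np.float32) - out_zero_point) * out_scale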

Recommended answer

If you applied post-training quantization, you have to make sure your representative dataset is not in float32. Furthermore, if you want to be certain the quantized model has int8 or uint8 input/output, you should consider quantization-aware training. That also gives you better quantization results.
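
For illustration, a minimal quantization-aware training sketch could look like the following, assuming TF 2.x with the tensorflow_model_optimization package installed, a Keras model named model, and a hypothetical training dataset train_ds:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Wrap the model with fake-quantization nodes and fine-tune it.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam',
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])
q_aware_model.fit(train_ds, epochs=1)

# Convert the fine-tuned model; the ranges learned during training are used for quantization.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()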

I also tried to quantize your model from the image and code you gave me, and it is quantized after all.

