TFLite 对视频输入的推断 [英] TFLite Inference on video input

查看:53
本文介绍了TFLite 对视频输入的推断的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个 SSD tflite 检测模型,我在台式计算机上使用 Python 运行该模型.至于现在,我下面的脚本将单个图像作为推理的输入,并且工作正常:

I have an SSD tflite detection model that I am running with Python on a desktop computer. As for now, my script below takes a single image as an input for inference and it works fine:

    # Load TFLite model and allocate tensors.
    interpreter = tf.lite.Interpreter(model_path="model.tflite")
    interpreter.allocate_tensors()

    img_resized = Image.open(file_name)
    input_data = np.expand_dims(img_resized, axis=0)
    input_data = (np.float32(input_data) - input_mean) / input_std

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])

如何对 .mp4 视频作为输入运行推理?

How to run inference on a .mp4 video as an input?

是否也可以从该视频上检测到的对象中绘制边界框?

Is it also possible to draw bounding boxes from detected objects on that video?

推荐答案

回答关于对视频运行推理的第一个问题.这是您可以使用的代码.我为分类模型的推理编写了此代码,因此在您的情况下, output_data 变量的输出将采用边界框的形式,您必须使用 OpenCV 将它们映射到帧上,这也回答了您的第二个问题(绘制边界视频中的框).

To Answer your first question of running inference on a video. Here is the code that you can use. I made this code for the inference of classification model, So in your case the output of the output_data variable will be in the form of bounding boxes, you have to map them on the frames using OpenCV which answer your second question as well (drawing bounding boxes on the video).

import cv2
from PIL import Image
import numpy as np
import tensorflow as tf

def read_tensor_from_readed_frame(frame, input_height=224, input_width=224,
        input_mean=0, input_std=255):
  output_name = "normalized"
  float_caster = tf.cast(frame, tf.float32)
  dims_expander = tf.expand_dims(float_caster, 0);
  resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
  normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
  sess = tf.Session()
  result = sess.run(normalized)
  return result

def load_labels(label_file):
  label = []
  proto_as_ascii_lines = tf.gfile.GFile(label_file).readlines()
  for l in proto_as_ascii_lines:
    label.append(l.rstrip())
  return label

def VideoSrcInit(paath):
    cap = cv2.VideoCapture(paath)
    flag, image = cap.read()
    if flag:
        print("Valid Video Path. Lets move to detection!")
    else:
        raise ValueError("Video Initialization Failed. Please make sure video path is valid.")
    return cap

def main():
  Labels_Path = "labels.txt"
  Model_Path = "model.tflite"
  input_path = "video.mp4"

  ##Loading labels
  labels = load_labels(Labels_Path)

  ##Load tflite model and allocate tensors
  interpreter = tf.lite.Interpreter(model_path=Model_Path)
  interpreter.allocate_tensors()
  # Get input and output tensors.
  input_details = interpreter.get_input_details()
  output_details = interpreter.get_output_details()

  input_shape = input_details[0]['shape']

  ##Read video
  cap = VideoSrcInit(input_path)

  while True:
    ok, cv_image = cap.read()
    if not ok:
      break

    ##Converting the readed frame to RGB as opencv reads frame in BGR
    image = Image.fromarray(cv_image).convert('RGB')

    ##Converting image into tensor
    image_tensor = read_tensor_from_readed_frame(image ,224, 224)

    ##Test model
    interpreter.set_tensor(input_details[0]['index'], image_tensor)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])

    ## You need to check the output of the output_data variable and 
    ## map it on the frame in order to draw the bounding boxes.


    cv2.namedWindow("cv_image", cv2.WINDOW_NORMAL)
    cv2.imshow("cv_image",cv_image)

    ##Use p to pause the video and use q to termiate the program
    key = cv2.waitKey(10) & 0xFF
    if key == ord("q"):
      break
    elif key == ord("p"):
      cv2.waitKey(0)
      continue 
  cap.release()

if __name__ == '__main__':
  main()

这篇关于TFLite 对视频输入的推断的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆