Get the bounding box coordinates in the TensorFlow object detection API tutorial

Question

I am new to both Python and TensorFlow. I am trying to run the object_detection_tutorial file from the TensorFlow Object Detection API, but I cannot find where to get the coordinates of the bounding boxes when objects are detected.

Relevant code:

# The following processing is only for single image
detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])

...

The place where I assume bounding boxes are drawn is like this:

# Visualization of the results of a detection.
vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    output_dict['detection_boxes'],
    output_dict['detection_classes'],
    output_dict['detection_scores'],
    category_index,
    instance_masks=output_dict.get('detection_masks'),
    use_normalized_coordinates=True,
    line_thickness=8)
plt.figure(figsize=IMAGE_SIZE)
plt.imshow(image_np)

I tried printing output_dict['detection_boxes'] but I am not sure what the numbers mean. There are a lot.

array([[ 0.56213236,  0.2780568 ,  0.91445708,  0.69120586],
       [ 0.56261235,  0.86368728,  0.59286624,  0.8893863 ],
       [ 0.57073039,  0.87096912,  0.61292225,  0.90354401],
       [ 0.51422435,  0.78449738,  0.53994244,  0.79437423],

......

       [ 0.32784131,  0.5461576 ,  0.36972913,  0.56903434],
       [ 0.03005961,  0.02714229,  0.47211722,  0.44683522],
       [ 0.43143299,  0.09211366,  0.58121657,  0.3509962 ]], dtype=float32)

I found answers for similar questions, but I don't have a variable called boxes as they do. How can I get the coordinates? Thank you!

Recommended answer

I tried printing output_dict['detection_boxes'] but I am not sure what the numbers mean

You can check out the code for yourself: visualize_boxes_and_labels_on_image_array is defined in object_detection/utils/visualization_utils.py, the module the tutorial imports as vis_util.

Note that you are passing use_normalized_coordinates=True. If you trace the function calls, you will see that your numbers, such as [0.56213236, 0.2780568, 0.91445708, 0.69120586], are the values [ymin, xmin, ymax, xmax], each expressed as a fraction of the image height (for y) or width (for x). The image (pixel) coordinates:

(left, right, top, bottom) = (xmin * im_width, xmax * im_width, 
                              ymin * im_height, ymax * im_height)

are computed by the function:

def draw_bounding_box_on_image(image,
                               ymin,
                               xmin,
                               ymax,
                               xmax,
                               color='red',
                               thickness=4,
                               display_str_list=(),
                               use_normalized_coordinates=True):
  """Adds a bounding box to an image.
  Bounding box coordinates can be specified in either absolute (pixel) or
  normalized coordinates by setting the use_normalized_coordinates argument.
  Each string in display_str_list is displayed on a separate line above the
  bounding box in black text on a rectangle filled with the input 'color'.
  If the top of the bounding box extends to the edge of the image, the strings
  are displayed below the bounding box.
  Args:
    image: a PIL.Image object.
    ymin: ymin of bounding box.
    xmin: xmin of bounding box.
    ymax: ymax of bounding box.
    xmax: xmax of bounding box.
    color: color to draw bounding box. Default is red.
    thickness: line thickness. Default value is 4.
    display_str_list: list of strings to display in box
                      (each to be shown on its own line).
    use_normalized_coordinates: If True (default), treat coordinates
      ymin, xmin, ymax, xmax as relative to the image.  Otherwise treat
      coordinates as absolute.
  """
  draw = ImageDraw.Draw(image)
  im_width, im_height = image.size
  if use_normalized_coordinates:
    (left, right, top, bottom) = (xmin * im_width, xmax * im_width,
                                  ymin * im_height, ymax * im_height)
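
To get the coordinates yourself rather than only drawing them, the same scaling can be applied directly to output_dict['detection_boxes']. The following is a minimal sketch, not part of the API: the helper name boxes_to_pixels and its min_score parameter are hypothetical, and it assumes each row is a normalized [ymin, xmin, ymax, xmax] box with a matching entry in output_dict['detection_scores']. Filtering by score is one way to deal with the "lot" of rows, since many of them are typically low-confidence detections.

```python
def boxes_to_pixels(boxes, scores, im_width, im_height, min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel coordinates.

    Hypothetical helper (not part of the Object Detection API).
    Returns a list of (left, right, top, bottom) integer tuples for
    every box whose score is at least min_score.
    """
    pixel_boxes = []
    for box, score in zip(boxes, scores):
        if float(score) < min_score:
            continue  # skip low-confidence detections
        ymin, xmin, ymax, xmax = (float(v) for v in box)
        # Same scaling as in draw_bounding_box_on_image above.
        left, right = xmin * im_width, xmax * im_width
        top, bottom = ymin * im_height, ymax * im_height
        pixel_boxes.append((round(left), round(right), round(top), round(bottom)))
    return pixel_boxes


# Made-up detections for illustration, not real model output.
example_boxes = [[0.5, 0.25, 1.0, 0.75], [0.1, 0.1, 0.2, 0.2]]
example_scores = [0.9, 0.3]
print(boxes_to_pixels(example_boxes, example_scores, im_width=640, im_height=480))
# prints [(160, 480, 240, 480)]
```

In the tutorial, the height and width would come from the image array, for example im_height, im_width = image_np.shape[:2], and the call would be boxes_to_pixels(output_dict['detection_boxes'], output_dict['detection_scores'], im_width, im_height).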
