Reading PASCAL VOC annotations in Python

Question

I have annotations in XML files such as this one, which follows the PASCAL VOC convention:

<annotation>
    <folder>training</folder>
    <filename>chanel1.jpg</filename>
    <source>
        <database>synthetic initialization</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>synthetic</image>
        <flickrid>none</flickrid>
    </source>
    <owner>
        <flickrid>none</flickrid>
        <name>none</name>
    </owner>
    <size>
        <width>640</width>
        <height>427</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>chanel</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>344</xmin>
            <ymin>10</ymin>
            <xmax>422</xmax>
            <ymax>83</ymax>
        </bndbox>
    </object>
    <object>
        <name>chanel</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>355</xmin>
            <ymin>165</ymin>
            <xmax>443</xmax>
            <ymax>206</ymax>
        </bndbox>
    </object>
</annotation>

What is the cleanest way to retrieve, for example, the filename and bndbox fields in Python?

I tried ElementTree, which seems to be the official Python solution, but I can't make it work.

My code so far:

from xml.etree import ElementTree as ET
tree = ET.parse("data/all/annotations/" + file)
fn = tree.find('filename').text
boxes = tree.findall('bndbox')

This yields

fn == 'chanel1.jpg'
boxes == []

So it successfully extracts the filename field, but not the bndbox elements.

Recommended answer

Here is a fairly simple solution to your problem:

This returns the filename and your box coordinates as a nested list, with each box in the form [xmin, ymin, xmax, ymax]. I once struggled with bndbox tags whose children were mixed up (ymin, xmin, ...) or in other strange orders, so this code reads each value by its tag name rather than by its position.
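To illustrate the point about tag order, here is a minimal sketch that compares a positional read with a lookup by tag name. The SHUFFLED string is a hypothetical bndbox invented for this illustration, not taken from a real dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical <bndbox> whose children are not in the usual
# xmin, ymin, xmax, ymax order (assumption, for illustration only).
SHUFFLED = (
    "<bndbox>"
    "<ymin>10</ymin><xmin>344</xmin><ymax>83</ymax><xmax>422</xmax>"
    "</bndbox>"
)

bndbox = ET.fromstring(SHUFFLED)

# Positional read: takes the children in document order, whatever they are.
by_position = [int(child.text) for child in bndbox]

# Lookup by tag name: the result does not depend on the child order.
by_name = [int(bndbox.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]

print(by_position)  # [10, 344, 83, 422]
print(by_name)      # [344, 10, 422, 83]
```

The positional list silently swaps x and y here, while the lookup by name still produces [xmin, ymin, xmax, ymax].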

Finally, I updated the code. Thanks to craq and Pritesh Gohil, you were absolutely right.

Hope this helps...

import xml.etree.ElementTree as ET


def read_content(xml_file: str):
    """Return the filename and a list of [xmin, ymin, xmax, ymax] boxes."""
    tree = ET.parse(xml_file)
    root = tree.getroot()

    # The filename belongs to the whole annotation, so read it once,
    # outside the loop; this also works for files without any <object>.
    filename = root.find('filename').text

    list_with_all_boxes = []

    for boxes in root.iter('object'):
        # Look each coordinate up by tag name, so the order of the
        # children inside <bndbox> does not matter.
        xmin = int(boxes.find("bndbox/xmin").text)
        ymin = int(boxes.find("bndbox/ymin").text)
        xmax = int(boxes.find("bndbox/xmax").text)
        ymax = int(boxes.find("bndbox/ymax").text)

        list_with_all_boxes.append([xmin, ymin, xmax, ymax])

    return filename, list_with_all_boxes


name, boxes = read_content("file.xml")
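
As for why the code in the question returned an empty list: in ElementTree, findall with a bare tag name matches only direct children of the element it is called on, and bndbox is nested inside object. The sketch below shows the difference on a trimmed-down copy of the annotation from the question; the SAMPLE string and the variable names are only for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the annotation from the question.
SAMPLE = """
<annotation>
  <filename>chanel1.jpg</filename>
  <object>
    <name>chanel</name>
    <bndbox><xmin>344</xmin><ymin>10</ymin><xmax>422</xmax><ymax>83</ymax></bndbox>
  </object>
  <object>
    <name>chanel</name>
    <bndbox><xmin>355</xmin><ymin>165</ymin><xmax>443</xmax><ymax>206</ymax></bndbox>
  </object>
</annotation>
"""

root = ET.fromstring(SAMPLE)

# A bare tag name only searches the direct children of <annotation>.
direct = root.findall("bndbox")

# The ".//" prefix searches all descendants, at any depth.
nested = root.findall(".//bndbox")

boxes = [
    [int(b.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
    for b in nested
]

print(direct)  # []
print(boxes)   # [[344, 10, 422, 83], [355, 165, 443, 206]]
```

So changing the query in the question to './/bndbox' should also find the boxes; iterating over object elements, as in the answer, additionally keeps each box tied to its object.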
