Python: Extract Metadata from PNG

Question

I am able to extract the necessary information using R, but for consistency within the overall project, I would like to be able to do it with Python (preferably Python3). I need the contents of a single tag called "Settings". This tag contains XML which will then need to be parsed.

Getting the metadata in R is incredibly easy:

library(exifr)
library(XML)

path = file.path('path', 'to', 'file')

x = read_exif(file.path(path,'image.png'))
x$Settings

It doesn't look like Python can do it, which boggles my mind. Or doing so requires me to have far more knowledge of Python and PNGs than I have at the moment. How can I extract PNG metadata using Python?

PyPNG

PyPNG seems promising. Examining the length of each chunk, it seems likely the "Settings" tag lives in the zTXt chunk.

import png

filename = "C:\\path\\to\\image.png"

im = png.Reader(filename)

for c in im.chunks():
    print(c[0], len(c[1]))

>>>
IHDR 13
tIME 7
pHYs 9
IDAT 47775
zTXt 714
IEND 0

The above was taken from this post. However, it's still unclear how to extract the zTXt data.
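For reference, a rough sketch of what reading that chunk involves, using only the standard library. Per the PNG specification, a zTXt payload is a Latin-1 keyword, a null byte, one compression-method byte (0 means zlib), and then the zlib-compressed text; a tEXt payload is the same without the compression. The function names iter_chunks and text_chunks are made up for this illustration, and the sketch does not verify CRCs or handle iTXt chunks.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (chunk_type, payload) pairs from the raw bytes of a PNG file."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length


def text_chunks(data):
    """Return {keyword: text} for the tEXt and zTXt chunks of a PNG (hypothetical helper)."""
    found = {}
    for ctype, payload in iter_chunks(data):
        if ctype == b"tEXt":
            keyword, _, text = payload.partition(b"\x00")
        elif ctype == b"zTXt":
            keyword, _, rest = payload.partition(b"\x00")
            # rest[0] is the compression method byte; the remainder is a zlib stream
            text = zlib.decompress(rest[1:])
        else:
            continue
        found[keyword.decode("latin-1")] = text.decode("latin-1")
    return found
```

With the file read as bytes (open(filename, "rb").read()), text_chunks(...).get("Settings") would return the XML string if the tag is indeed stored as a tEXt or zTXt keyword, which is an assumption about this particular file. The same decoding should apply to the zTXt payload (c[1]) that the PyPNG loop above already yields.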

hachoir3

Using the hachoir3 package, I tried the following:

from hachoir.parser import createParser
from hachoir.metadata import extractMetadata

filename = "C:\\path\\to\\file\\image.png"
parser = createParser(filename)
metadata = extractMetadata(parser)

for line in metadata.exportPlaintext():
    print(line)

This gave me the following:

Metadata:
- Image width: 1024 pixels
- Image height: 46 pixels
- Bits/pixel: 16
- Pixel format: RGB
- Compression rate: 2.0x
- Image DPI width: 1 DPI
- Image DPI height: 1 DPI
- Creation date: 2016-07-13 19:09:28
- Compression: deflate
- MIME type: image/png
- Endianness: Big endian

I can't seem to get at the field I need, the "Settings" one referenced in the R code. I've had no luck with other methods, such as metadata.get. As far as I can tell, those seem to be the two options for parsing PNG metadata. The docs read,

Some good (but not perfect ;-)) parsers:

- Matroska video
- Microsoft RIFF (AVI video, WAV audio, CDA file)
- PNG picture
- TAR and ZIP archive

Maybe it just doesn't have the functionality I need?

Pillow

Following the advice given in this post:

from PIL import Image
filename = "C:\\path\\to\\file\\image.png"
im = Image.open(filename)

This reads in the image, but im.info only returns {'aspect': (1, 1)}. Reading through the documentation, it doesn't look like any of the methods get at the metadata. I read through the PNG description provided in the post. Honestly, I don't know how to make use of that information, nor how Pillow would help me.

There are some posts which imply that what I need can be done, but they do not work. For example, this post suggests using the ExifTags library:

from PIL import Image, ExifTags
filename = "C:\\path\\to\\file\\image.png"
im = Image.open(filename)
exif = { ExifTags.TAGS[k]: v for k, v in im._getexif().items() if k in ExifTags.TAGS}

The problem is, AttributeError: 'PngImageFile' object has no attribute '_getexif'. According to the documentation, the ._getexif feature is experimental and only applies to JPGs.

Reading through the overall Pillow documentation, it really only talks about JPG and TIFF. Processing PNG files doesn't seem to be part of the package at all. So like hachoir, maybe it can't be done?

PIL

There's apparently another package PIL from which Pillow was forked. It looks like it was abandoned in 2009.

Recommended answer

Here is an inelegant and clumsy but working solution.

Adapted from here: https://motherboard.vice.com/en_us/article/aekn58/hack-this-extra-image-metadata-using-python

You can call the command-line exiftool app from within Python and then parse the results.

Below is the code which works in Python 3.6.3 under Ubuntu 16.04:

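import subprocess

# Run exiftool and capture its output; -h asks for the tags as an HTML table
result = subprocess.run(['exiftool', '-h', '/home/jason/Pictures/kitty_mask.png'], stdout=subprocess.PIPE)
print(type(result))
print("\n\n", result.stdout)  # the captured output as raw bytes
# Decode the bytes into a regular string before parsing
normal_string = result.stdout.decode("utf-8")
print("\n\n", normal_string)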

It produces the following results for my test image:

> <class 'subprocess.CompletedProcess'>
> 
> 
>  b'<!-- /home/jason/Pictures/kitty_mask.png
> -->\n<table>\n<tr><td>ExifTool Version Number</td><td>10.80</td></tr>\n<tr><td>File
> Name</td><td>kitty_mask.png</td></tr>\n<tr><td>Directory</td><td>/home/jason/Pictures</td></tr>\n<tr><td>File
> Size</td><td>25 kB</td></tr>\n<tr><td>File Modification
> Date/Time</td><td>2018:07:02 09:35:00+01:00</td></tr>\n<tr><td>File
> Access Date/Time</td><td>2018:07:09
> 16:23:24+01:00</td></tr>\n<tr><td>File Inode Change
> Date/Time</td><td>2018:07:02 09:35:00+01:00</td></tr>\n<tr><td>File
> Permissions</td><td>rw-r--r--</td></tr>\n<tr><td>File
> Type</td><td>PNG</td></tr>\n<tr><td>File Type
> Extension</td><td>png</td></tr>\n<tr><td>MIME
> Type</td><td>image/png</td></tr>\n<tr><td>Image
> Width</td><td>2448</td></tr>\n<tr><td>Image
> Height</td><td>3264</td></tr>\n<tr><td>Bit
> Depth</td><td>8</td></tr>\n<tr><td>Color
> Type</td><td>RGB</td></tr>\n<tr><td>Compression</td><td>Deflate/Inflate</td></tr>\n<tr><td>Filter</td><td>Adaptive</td></tr>\n<tr><td>Interlace</td><td>Noninterlaced</td></tr>\n<tr><td>Image
> Size</td><td>2448x3264</td></tr>\n<tr><td>Megapixels</td><td>8.0</td></tr>\n</table>\n'
> 
> 
>  <!-- /home/jason/Pictures/kitty_mask.png --> <table> <tr><td>ExifTool
> Version Number</td><td>10.80</td></tr> <tr><td>File
> Name</td><td>kitty_mask.png</td></tr>
> <tr><td>Directory</td><td>/home/jason/Pictures</td></tr> <tr><td>File
> Size</td><td>25 kB</td></tr> <tr><td>File Modification
> Date/Time</td><td>2018:07:02 09:35:00+01:00</td></tr> <tr><td>File
> Access Date/Time</td><td>2018:07:09 16:23:24+01:00</td></tr>
> <tr><td>File Inode Change Date/Time</td><td>2018:07:02
> 09:35:00+01:00</td></tr> <tr><td>File
> Permissions</td><td>rw-r--r--</td></tr> <tr><td>File
> Type</td><td>PNG</td></tr> <tr><td>File Type
> Extension</td><td>png</td></tr> <tr><td>MIME
> Type</td><td>image/png</td></tr> <tr><td>Image
> Width</td><td>2448</td></tr> <tr><td>Image
> Height</td><td>3264</td></tr> <tr><td>Bit Depth</td><td>8</td></tr>
> <tr><td>Color Type</td><td>RGB</td></tr>
> <tr><td>Compression</td><td>Deflate/Inflate</td></tr>
> <tr><td>Filter</td><td>Adaptive</td></tr>
> <tr><td>Interlace</td><td>Noninterlaced</td></tr> <tr><td>Image
> Size</td><td>2448x3264</td></tr>
> <tr><td>Megapixels</td><td>8.0</td></tr> </table>
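
The "parse the results" step is not shown above. As a minimal sketch, assuming the output keeps the two-column table layout printed above, the standard library's html.parser can turn it into a dictionary. The class TableToDict and the function parse_exiftool_html are names invented for this illustration, not part of exiftool or of the original answer.

```python
from html.parser import HTMLParser


class TableToDict(HTMLParser):
    """Collect <tr><td>name</td><td>value</td></tr> rows into a dict."""

    def __init__(self):
        super().__init__()
        self.tags = {}
        self._row = None       # cells of the row being read, or None outside a row
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._row.append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            # Keep only well-formed name/value rows
            if len(self._row) == 2:
                self.tags[self._row[0]] = self._row[1]
            self._row = None

    def handle_data(self, data):
        # Text may arrive in several pieces, so append to the current cell
        if self._in_cell:
            self._row[-1] += data


def parse_exiftool_html(html_text):
    """Return {tag name: value} from the HTML table printed by `exiftool -h`."""
    parser = TableToDict()
    parser.feed(html_text)
    parser.close()
    return parser.tags
```

With the code above, parse_exiftool_html(normal_string) gives a dictionary keyed by the names in the first column, so a tag such as "Settings" could be looked up by name if exiftool reports it for the file. exiftool also has a -j option that prints JSON instead of HTML, which json.loads can read directly and which may be simpler than parsing a table.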
