Questions about EXIF in hexadecimal form


Question


I am trying to understand the EXIF header portion of a JPEG file (in hex) so that I can extract data from it, specifically GPS information. For better or worse, I am using VB.Net 2008 (sorry, it is what I can grasp right now). I have read the first 64K of a JPEG into a byte array and have a vague idea of how the data is arranged. Using the EXIF specification documents, versions 2.2 and 2.3, I see that there are tags that are supposed to correspond to actual byte sequences in the file. I see that there is a "GPS IFD" tag with a value of 8825 (in hex). I search the file for the hex string 8825 (which I understand to be the two bytes 88 and 25), and I believe the bytes that follow it denote, by way of an offset, where in the file the GPS data is located. For example, I have the following hex bytes, starting with 88 25: 88 25 00 04 00 00 00 01 00 00 05 9A 00 00 07 14. Is the string that I am looking for longer than 16 bytes? I get the impression that this string of data should be telling me where to find the actual GPS data in the file.

Halfway down the page at http://search.cpan.org/~bettelli/Image-MetaData-JPEG-0.153/lib/Image/MetaData/JPEG/Structures.pod#Exif_and_DCT, it says: "Each IFD block is a structured sequence of records, called, in the Exif jargon, Interoperability arrays. The beginning of the 0th IFD is given by the 'IFD0_Pointer' value. The structure of an IFD is the following:"

So, what is an IFD0_Pointer? Does it have to do with an offset? I presume an offset is so many bytes from a beginning point. If that is true, where is that beginning point?

Thanks for any responses.

Dale

Solution

I suggest you read The Exif Specification (PDF); it is clear and quite easy to follow. For a short primer, here is the summary of an article I wrote:


A JPEG/Exif file starts with the start of image marker (SOI). The SOI consists of two magic bytes, 0xFF 0xD8, identifying the file as a JPEG file. Following the SOI, there are a number of Application Marker sections (APP0, APP1, APP2, APP3, ...), which typically hold metadata.

Application Marker Sections

Each APPn section starts with a marker. For the APP0 section, the marker is 0xFF 0xE0, for the APP1 section 0xFF 0xE1, and so on. The marker bytes are followed by two bytes giving the size of the section (excluding the marker, including the size bytes). The size field is followed by variable-size application data. APPn sections are sequential, so you can skip entire sections (by using the section size) until you reach the one you are interested in. The contents of APPn sections vary; the following applies to the Exif APP1 section only.
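As a rough illustration of skipping from section to section, here is a minimal sketch in Python (not the asker's VB.Net; the function name `iter_segments` is made up for this example). It assumes the whole buffer starts at the SOI and stops at the start-of-scan marker 0xFF 0xDA, after which compressed image data follows.

```python
import struct

def iter_segments(data: bytes):
    """Yield (marker_byte, payload) for each section that follows the SOI."""
    if data[:2] != b"\xFF\xD8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: no more metadata sections
            break
        # The size is always big-endian and counts its own two bytes,
        # but not the two marker bytes.
        (size,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + size]
        pos += 2 + size
```

Filtering the results for `marker == 0xE1` gives the APP1 payloads, which is far more reliable than scanning the file for a byte pattern such as 88 25.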

The Exif APP1 Section

Exif metadata is stored in an APP1 section (there may be more than one APP1 section). The application data in an Exif APP1 section consists of the Exif marker 0x45 0x78 0x69 0x66 0x00 0x00 ("Exif\0\0"), the TIFF header and a number of Image File Directory (IFD) sections.
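Since there may be more than one APP1 section, it helps to check for the Exif marker before going further. A small sketch in Python (the helper name `tiff_block` is hypothetical) that verifies the six marker bytes and returns everything after them, i.e. the TIFF header and IFDs:

```python
EXIF_ID = b"Exif\x00\x00"  # 0x45 0x78 0x69 0x66 0x00 0x00

def tiff_block(app1_payload: bytes) -> bytes:
    """Return the TIFF header + IFD data of an Exif APP1 payload."""
    if not app1_payload.startswith(EXIF_ID):
        raise ValueError("this APP1 section does not hold Exif data")
    return app1_payload[len(EXIF_ID):]
```

Index 0 of the returned bytes is the start of the TIFF header, which is convenient because the offsets inside the Exif data are counted from that point.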

The TIFF Header

The TIFF header contains the byte order of the IFD sections and a pointer to the 0th IFD. The first two bytes are 0x49 0x49 (II for Intel) if the byte order is little-endian, or 0x4D 0x4D (MM for Motorola) if it is big-endian. The next two bytes are the magic number 42 ;) written in that byte order: 0x00 0x2A for big-endian, 0x2A 0x00 for little-endian. The four bytes after that give the offset to the 0th IFD, counted from the start of the TIFF header.

Important: The JPEG file itself (what you have been reading until now) will always be in big-endian format. However, the byte-order of IFD subsections may be different, and need to be converted (you know the byte-order from the TIFF header above).
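To make the header layout concrete, here is a sketch in Python that reads the byte order and the offset to the 0th IFD (presumably what the CPAN page in the question calls IFD0_Pointer). The function name `parse_tiff_header` is an assumption for this example, and `tiff` is assumed to begin at the first byte of the TIFF header.

```python
import struct

def parse_tiff_header(tiff: bytes):
    """Return (endian, ifd0_offset); endian is '<' or '>' for struct."""
    order = tiff[:2]
    if order == b"II":
        endian = "<"   # little-endian (Intel)
    elif order == b"MM":
        endian = ">"   # big-endian (Motorola)
    else:
        raise ValueError("unknown byte-order mark in TIFF header")
    # A 2-byte magic number followed by the 4-byte offset to the 0th IFD
    magic, ifd0_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return endian, ifd0_offset
```

The returned offset is relative to the start of the TIFF header, not to the start of the file, so `tiff[ifd0_offset:]` is where the 0th IFD begins.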

Image File Directories

Once you get this far, you have your pointer to the 0th IFD section and you are ready to read actual metadata. The remaining IFDs are referenced in different places. The offsets to the Exif IFD and the GPS IFD are given in fields of the 0th IFD. The offset to the 1st IFD is given after the 0th IFD fields. The offset to the Interoperability IFD is given in the Exif IFD.

IFDs are simply sequential records of metadata fields. The field count is given in the first two bytes of the IFD. Following the field count are the 12-byte fields. Following the fields, there is a 4-byte offset from the start of the TIFF header to the start of the 1st IFD. This value is meaningful for only the 0th IFD. Following this, there is the IFD data section.
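The layout just described (2-byte count, then 12-byte fields, then a 4-byte offset) can be read with a short loop. This is only a sketch in Python; `read_ifd` is a hypothetical name, `tiff` is assumed to start at the TIFF header, and `endian` is '<' or '>' as determined from the header.

```python
import struct

def read_ifd(tiff: bytes, offset: int, endian: str):
    """Return (fields, trailing_offset) for the IFD at `offset`.

    Each field is (tag, type, count, last4), where last4 holds the raw
    4 bytes that are either the value itself or an offset to it.
    """
    (field_count,) = struct.unpack_from(endian + "H", tiff, offset)
    fields = []
    pos = offset + 2
    for _ in range(field_count):
        tag, typ, count = struct.unpack_from(endian + "HHI", tiff, pos)
        fields.append((tag, typ, count, tiff[pos + 8:pos + 12]))
        pos += 12
    # The 4 bytes after the last field
    (trailing_offset,) = struct.unpack_from(endian + "I", tiff, pos)
    return fields, trailing_offset
```

Looking through the returned fields of the 0th IFD for tag 0x8825 is how you would locate the GPS IFD pointer from the question.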

IFD Fields

Fields are 12-byte subsections of IFD sections. The first two bytes of each field give the tag ID as defined in the Exif standard. The next two bytes give the type of the field data. You will have 1 for byte, 2 for ascii, 3 for short (uint16), 4 for long (uint32), etc. Check the Exif Specification for the complete list.

The following four bytes, the count, may be a little confusing. For byte arrays (ascii and undefined types), the byte length of the array is given. For example, for the ASCII string "Exif", the count will be 5, including the null terminator. For other types, this is the number of field components (e.g. 4 shorts, 3 rationals).

Following the count, we have the 4-byte field value. However, if the length of the field data exceeds 4 bytes, it will be stored in the IFD Data section instead. In this case, this value will be the offset from the start of the TIFF header to the start of the field data. For example, for a long (uint32, 4 bytes), this will be the field value. For a rational (2 x uint32, 8 bytes), this will be an offset to the 8-byte field data.
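Applied to the bytes in the question, this works out as follows. The sketch below is in Python and assumes the file is big-endian (MM), which is what reading 88 25 in file order as tag 0x8825 implies; `decode_field` and `TYPE_SIZES` are names invented for this example.

```python
import struct

# Bytes per component for the Exif field types: 1=byte, 2=ascii, 3=short,
# 4=long, 5=rational, 7=undefined, 9=slong, 10=srational
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 7: 1, 9: 4, 10: 8}

def decode_field(field: bytes, endian: str):
    """Split a 12-byte field; report whether its data is inline or elsewhere."""
    tag, typ, count = struct.unpack(endian + "HHI", field[:8])
    data_len = TYPE_SIZES[typ] * count
    last4 = field[8:12]
    if data_len <= 4:
        # Data fits: it is stored left-justified in the last 4 bytes
        return tag, typ, count, "inline", last4[:data_len]
    # Data does not fit: the last 4 bytes are an offset from the TIFF header
    (offset,) = struct.unpack(endian + "I", last4)
    return tag, typ, count, "offset", offset

# The first 12 bytes quoted in the question: 88 25 | 00 04 | 00 00 00 01 | 00 00 05 9A
field = bytes.fromhex("88 25 00 04 00 00 00 01 00 00 05 9A")
tag, typ, count, kind, data = decode_field(field, ">")
gps_ifd_offset = int.from_bytes(data, "big")  # 0x59A
```

So the field is exactly 12 bytes: tag 0x8825, type 4 (long), count 1, value 0x0000059A. Because the value of this particular tag is itself a pointer, the GPS IFD should start 0x59A bytes after the start of the TIFF header, not 0x59A bytes into the file. The remaining 00 00 07 14 in the quoted string belongs to whatever follows the field; whether that is another field or the 4-byte offset that ends the IFD cannot be told from these 16 bytes alone.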


This is basically how metadata is arranged in a JPEG/Exif file. There are a few caveats to keep in mind (remember to convert the byte order as needed, offsets are from the start of the TIFF header, jump to data sections to read long fields, ...) but the format is quite easy to read.

[Figure: color-coded hex view of a JPEG/Exif file. The blue block is the SOI, orange is the TIFF header, green is the IFD size and offset bytes, light purple blocks are IFD fields, and dark purple blocks are field data.]
