Reverse engineering iWork '13 formats

Question

Prior versions of Apple's iWork suite used a very simple document format:


  • documents were Bundles of resources (folders, zipped or not)
  • the bundle contained an index.apxl[z] file describing the document structure in a proprietary but fairly easy to understand schema

iWork '13 has completely redone the format. Documents are still bundles, but what was in the index XML file is now encoded in a set of binary files with type suffix .iwa packed into Index.zip.
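To look at the bundle layout programmatically, a minimal sketch along these lines can list the .iwa entries. It assumes Index.zip is a standard zip archive that Python's `zipfile` module can open, which nothing here confirms; `list_iwa_members` is a hypothetical helper name.

```python
import zipfile


def list_iwa_members(index_zip):
    """Return the sorted names of all .iwa entries in an Index.zip archive.

    `index_zip` may be a filesystem path or a binary file-like object.
    Assumption: the archive is a plain zip that `zipfile` can read.
    """
    with zipfile.ZipFile(index_zip) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".iwa")
        )
```

Calling `list_iwa_members` with the path to a document's Index.zip should then return names like the ones listed below.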

In Keynote, for example, there are the following iwa files:

AnnotationAuthorStorage.iwa
CalculationEngine.iwa
Document.iwa
DocumentStylesheet.iwa
MasterSlide-{n}.iwa
Metadata.iwa
Slide{m}.iwa
ThemeStylesheet.iwa
ViewState.iwa
Tables/DataList.iwa

for MasterSlides 1…n and Slides 1…m

The purpose of each of these is quite clear from their naming. The files even appear uncompressed, with essentially all content text directly visible as strings among the binary blobs (albeit with some RTF/NSAttributedString-like garbage mixed in among the readable ASCII characters).

I have posted the unpacked Index of a simple example Keynote document here: https://github.com/jrk/iwork-13-format.

However, the overall file format is non-obvious to me. Apple has a long history of using simple, platform-standard formats like plists for encoding most of their documents, but there is no clear type tag at the start of the files, and it is not obvious to me what these iwa files are.

Do these files ring any bells? Is there evidence they are in some reasonably comprehensible serialization format?

Rummaging through the Keynote app runtime and class dumps with F-Script, the only evidence I've found is for some use of Protocol Buffers in the serialization classes which seem to be used for iWork, e.g.: https://github.com/nst/iOS-Runtime-Headers/blob/master/PrivateFrameworks/iWorkImport.framework/TSPArchiverBase.h.

Quickly piping a few of the files through protoc --decode_raw with the first 0…16 bytes lopped off produced nothing obviously usable.
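The same experiment can be roughed out without `protoc`. The sketch below is a simplified stand-in for `protoc --decode_raw`, not a replacement for it: it walks only the top-level Protocol Buffers wire format (varint keys, wire types 0, 1, 2 and 5) and reports which leading byte offsets leave a buffer that parses cleanly. The names `read_varint`, `parse_raw` and `offsets_that_parse` are hypothetical, and a clean parse is only a hint, since random bytes can occasionally look like valid fields.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint at buf[pos]; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift >= 70:
            raise ValueError("varint too long")


def parse_raw(buf):
    """Split buf into top-level (field_number, wire_type, value) tuples."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if field_number == 0:
            raise ValueError("field number 0 is not valid")
        if wire_type == 0:            # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:          # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:          # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:          # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:                         # groups and unknown types: give up
            raise ValueError("unsupported wire type %d" % wire_type)
        if pos > len(buf):
            raise ValueError("field runs past end of buffer")
        fields.append((field_number, wire_type, value))
    return fields


def offsets_that_parse(data, max_skip=16):
    """Return every skip in 0..max_skip for which data[skip:] parses."""
    good = []
    for skip in range(max_skip + 1):
        if skip >= len(data):
            break                     # nothing left to parse
        try:
            parse_raw(data[skip:])
        except ValueError:
            continue
        good.append(skip)
    return good
```

If no offset parses cleanly, the payload is probably wrapped or compressed rather than being a bare Protobuf message.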

Recommended answer

I've done some work reverse engineering the format and published my results here. I've written up a description of the format and provided a sample project as well.

Basically, the .iwa files are Protobuf streams compressed using Snappy.
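To make the Snappy part concrete, here is a minimal pure-Python sketch of decompressing one raw Snappy block, following the public Snappy format description. How .iwa files wrap such blocks (any chunk headers or framing) is not stated in this answer and is deliberately not handled, so the input is assumed to be a single bare block. `snappy_decompress_block` is a hypothetical name, and a maintained Snappy library would be the better choice for real use.

```python
def snappy_decompress_block(data):
    """Decompress one raw Snappy block (no framing) and return bytes."""
    # Preamble: uncompressed length stored as a little-endian varint.
    expected, shift, pos = 0, 0, 0
    while True:
        byte = data[pos]
        pos += 1
        expected |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7

    out = bytearray()
    while pos < len(data):
        tag = data[pos]
        pos += 1
        kind = tag & 0x03
        if kind == 0:
            # Literal: upper six bits hold length - 1, or 60..63 meaning
            # that the next 1..4 bytes hold length - 1.
            length = (tag >> 2) + 1
            if length > 60:
                extra = length - 60
                length = int.from_bytes(data[pos:pos + extra], "little") + 1
                pos += extra
            out += data[pos:pos + length]
            pos += length
            continue
        if kind == 1:
            # Copy with 11-bit offset: 3 high bits in the tag, 8 in next byte.
            length = ((tag >> 2) & 0x07) + 4
            offset = ((tag >> 5) << 8) | data[pos]
            pos += 1
        else:
            # Copy with a 2-byte (kind 2) or 4-byte (kind 3) offset.
            width = 2 if kind == 2 else 4
            length = (tag >> 2) + 1
            offset = int.from_bytes(data[pos:pos + width], "little")
            pos += width
        if offset == 0 or offset > len(out):
            raise ValueError("copy offset out of range")
        # Copy byte by byte so overlapping copies (offset < length) work.
        for _ in range(length):
            out.append(out[-offset])

    if len(out) != expected:
        raise ValueError("length mismatch after decompression")
    return bytes(out)
```

The decompressed bytes would then be handed to a Protobuf decoder; the message layout inside them is described in the write-up rather than here.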

Hope this helps!