How to decode the cross-reference stream of a PDF?

Problem description

I'm developing a PDF parser and have run into real trouble that I can't work out on my own.

6 0 obj
<</DecodeParms<</Columns 4/Predictor 12>>/Filter/FlateDecode/ID[<275EB65AA3259D4FB5B288864AD5DDFC><3476ED24B8B8844D840F1CF1600D5665>]/Info 23 0 R/Length 59/Root 25 0 R/Size 24/Type/XRef/W[1 2 1]>>stream
(59 bytes of compressed binary stream data, not reproducible as text)
endstream


I decompress (inflate) the byte data between "stream" and "endstream" with zlib, but I can't work out how to parse the decompressed data. I read the PDF Reference 1.6 and found that I need to apply a predictor function. Can anyone tell me how to do it?

Solution

Download the source code from http://zlib.net/ and use its decompression function to decompress the data.

Even on Windows this won't be a problem; no changes are required at all.
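
As a rough illustration of the decompression step, here is a minimal sketch using Python's standard `zlib` module rather than the C library itself. The helper name `inflate_stream` is hypothetical, and the sample bytes are a stand-in, not the 59 bytes from the question. It assumes you have already sliced out exactly /Length bytes starting after the end-of-line that follows the `stream` keyword.

```python
import zlib


def inflate_stream(raw: bytes) -> bytes:
    """Decompress the raw bytes of a /FlateDecode stream.

    `raw` must be exactly the /Length bytes between the end-of-line after
    'stream' and the 'endstream' keyword (hypothetical helper name).
    """
    return zlib.decompress(raw)


# Stand-in data: compress some bytes ourselves, then inflate them again.
original = bytes(range(20))
compressed = zlib.compress(original)
inflated = inflate_stream(compressed)
```

The result is not yet the cross-reference table when /DecodeParms names a predictor; the predictor still has to be undone on the inflated bytes.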


Given that this stream uses Predictor 12, page 52 of the PDF Reference 1.6 shows that this corresponds to PNG prediction (on encoding, PNG Up on all rows).

In the middle of page 51 of the reference, we can see that the PNG predictor functions are defined in RFC 2083.

Here's the relevant section: RFC 2083, PNG, Section 6, Filter Algorithms (the predictor functions).
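
To make the PNG Up filter concrete, here is a minimal sketch of undoing it on already-inflated data. It assumes the defaults of one colour component and 8 bits per component, so each row is one filter-type byte followed by /Columns data bytes (5 bytes per row for /Columns 4). The function name `undo_png_predictor` is hypothetical, and only filter types 0 (None) and 2 (Up) are handled; the other PNG filter types are left out of this sketch.

```python
def undo_png_predictor(data: bytes, columns: int) -> bytes:
    """Reverse PNG None/Up filtering on inflated stream data.

    Assumes Colors = 1 and BitsPerComponent = 8, so a row holds
    `columns` bytes preceded by a single filter-type byte.
    """
    row_len = columns + 1
    if len(data) % row_len != 0:
        raise ValueError("data length is not a whole number of rows")

    prior = bytearray(columns)  # the row above the first row is all zeros
    out = bytearray()
    for start in range(0, len(data), row_len):
        filter_type = data[start]
        row = bytearray(data[start + 1:start + row_len])
        if filter_type == 2:
            # Up: each byte was stored as (raw - byte above) mod 256
            for i in range(columns):
                row[i] = (row[i] + prior[i]) & 0xFF
        elif filter_type != 0:
            raise NotImplementedError(f"PNG filter type {filter_type}")
        out += row
        prior = row
    return bytes(out)


# Two made-up rows, both tagged with filter type 2 (Up).
encoded = bytes([2, 1, 0, 0x10, 0,
                 2, 0, 0, 0x10, 0])
decoded = undo_png_predictor(encoded, columns=4)
```

Reading the filter type from each row, rather than assuming Up everywhere, matters because the per-row byte is what actually selects the algorithm when a PNG predictor is in use.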


It may be worth looking at the code of libpng, or the source distribution of some other PNG reading/writing library, to see how the algorithm has been implemented successfully.
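
With the predictor undone, the remaining step for the stream in the question is to split each row into fields according to /W [1 2 1]: a 1-byte type, a 2-byte second field and a 1-byte third field, all big-endian. The sketch below uses a hypothetical helper, `parse_xref_rows`, and made-up sample bytes. For type 1 entries the second field is the byte offset of the object and the third is its generation number; for type 2 entries the second field is the object number of the containing object stream and the third is the index within it.

```python
def parse_xref_rows(data: bytes, widths=(1, 2, 1)):
    """Split un-predicted xref stream data into (type, field2, field3) tuples.

    `widths` is the /W array. A width of 0 for the first field means the
    type defaults to 1; other zero-width fields are read as 0 here.
    """
    row_len = sum(widths)
    if row_len == 0 or len(data) % row_len != 0:
        raise ValueError("data length does not match the /W field widths")

    entries = []
    for start in range(0, len(data), row_len):
        pos = start
        fields = []
        for w in widths:
            fields.append(int.from_bytes(data[pos:pos + w], "big"))
            pos += w
        if widths[0] == 0:
            fields[0] = 1  # absent type field defaults to type 1
        entries.append(tuple(fields))
    return entries


# Made-up rows: an in-use object at offset 16, and a compressed object
# stored at index 3 of object stream number 5.
rows = bytes([1, 0x00, 0x10, 0,
              2, 0x00, 0x05, 3])
entries = parse_xref_rows(rows)
```

Because the dictionary in the question has no /Index entry, the entries would describe object numbers 0 up to /Size minus 1 in order, so the position in the returned list gives the object number.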

I'd certainly be most interested in any and all progress. I'm implementing a PDF creation class in C++, and have just abandoned LZWDecode owing to the lack of documentation on how to go about creating a PDF-legal LZW stream. I have found a couple of micro zlib implementations, which typically implement only the deflate method and its prerequisites.
So this is a little further along than where I am at the moment, though I do hope to squeeze every last bit out of any encoded streams; it seems I'll need to use PNG predictors too at some point.

Cheers!

