Remove unused image objects

Question

I have PDF files that are created with a composition tool to produce financial statements.

Each PDF file is around 5,000 to 10,000 pages and uses global image resources to maximise space efficiency.

These statements include marketing images. There are many of them (about 3 MB worth), and not every statement uses all of the images.

When I extract pages from the PDF file using a tool that was developed for this purpose (or Adobe Acrobat, just for testing), for example extracting a blank page at the start of the file, the resulting PDF is around 3 MB. Auditing the space usage shows that it consists of 3 MB of images.

Using iTextSharp (latest version, 5.4.4), I have attempted to iterate through each page and copy it to a writer, calling reader.RemoveUnusedObjects. This does not reduce the size.

I also found another example that uses a PdfStamper and tried the same thing, with the same result.

I've also tried setting maximum compression and calling SetFullCompression. Neither made any difference.

Can anyone give me any pointers on what I might do? I'm hoping I can do this as a simple exercise and not have to parse the objects in the PDF file and manually remove the unused ones.

The code is as follows:

iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(inputFile);

iTextSharp.text.Document document = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(1));
// step 2: we create a writer that listens to the document
// step 3: we open the document

iTextSharp.text.pdf.PdfCopy pdfCpy = new iTextSharp.text.pdf.PdfCopy(document, new System.IO.FileStream(outputFile, System.IO.FileMode.Create));
document.Open();
iTextSharp.text.pdf.PdfContentByte cb = pdfCpy.DirectContent;
//pdfCpy.NewPage();
int objects = reader.RemoveUnusedObjects();
reader.RemoveFields();
reader.RemoveAnnotations();
// we retrieve the total number of pages
int numberofPages = reader.NumberOfPages;

int i = 0;
while (i < numberofPages)
{
    i++;
    document.SetPageSize(reader.GetPageSizeWithRotation(i));
    document.NewPage();

    iTextSharp.text.pdf.PdfImportedPage page = pdfCpy.GetImportedPage(reader, i);
    pdfCpy.SetFullCompression();
    reader.RemoveUnusedObjects();
    reader.RemoveFields();
    reader.RemoveAnnotations();
    int rotation = reader.GetPageRotation(i);
    if (rotation == 90 || rotation == 270)
    {
        cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
    }
    else
    {
        cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
    }
    pdfCpy.AddPage(page);

}
pdfCpy.NewPage();
pdfCpy.Add(new iTextSharp.text.Paragraph("This is added text"));

document.Close();
pdfCpy.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION;
pdfCpy.Close();
reader.Close();

Stamper example:

iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(inputFile);
using (FileStream fs = new FileStream(outputFile + ".2" , FileMode.Create))
{
    iTextSharp.text.pdf.PdfStamper stamper = new iTextSharp.text.pdf.PdfStamper(reader, fs, iTextSharp.text.pdf.PdfWriter.VERSION_1_5);
    iTextSharp.text.pdf.PdfWriter writer = stamper.Writer;
    writer.SetPdfVersion(iTextSharp.text.pdf.PdfWriter.PDF_VERSION_1_5);
    writer.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION;
    reader.RemoveFields();
    reader.RemoveUnusedObjects();
    stamper.Reader.RemoveUnusedObjects();

    stamper.SetFullCompression();
    stamper.Writer.SetFullCompression();
    stamper.Close();
}
reader.Close();


Recommended answer

Try using iTextSharp.text.pdf.PdfSmartCopy instead of PdfCopy.

For me, it reduced a PDF of ~43 MB to ~4 MB.
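
The answer gives no code, so the following is only a minimal sketch of what swapping PdfCopy for PdfSmartCopy might look like in iTextSharp 5.x. The class name `PdfShrinker`, the method `CopyPages`, and its parameters are hypothetical names chosen for illustration. PdfSmartCopy writes identical streams only once; whether that also drops images that a page's resources reference but the page never draws is not confirmed by the answer and should be checked against the actual files.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class PdfShrinker
{
    // Copies pages firstPage..lastPage (1-based, inclusive) from inputFile
    // to outputFile using PdfSmartCopy instead of PdfCopy.
    public static void CopyPages(string inputFile, string outputFile,
                                 int firstPage, int lastPage)
    {
        PdfReader reader = new PdfReader(inputFile);
        try
        {
            using (FileStream fs = new FileStream(outputFile, FileMode.Create))
            {
                Document document = new Document();
                PdfSmartCopy copy = new PdfSmartCopy(document, fs);

                // Optional: request object and cross-reference streams.
                // Called before the document is opened.
                copy.SetFullCompression();

                document.Open();
                for (int i = firstPage; i <= lastPage; i++)
                {
                    // AddPage is enough for a copy; no DirectContent or
                    // AddTemplate calls are needed here.
                    copy.AddPage(copy.GetImportedPage(reader, i));
                }

                // Closing the document also closes the writer.
                document.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A call such as `PdfShrinker.CopyPages(inputFile, outputFile, 1, 1)` would extract only the first page; comparing the output size against the PdfCopy version is the simplest way to see whether the change helps for a given file.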
