Why is my image distorted when decoding as FlateDecode using iTextSharp?

Problem description

When decoding an image within a PDF as FlateDecode via iTextSharp, the image is distorted and I can't figure out why.

The recognized bpp is Format1bppIndexed. If I change the PixelFormat to Format4bppIndexed, the image is recognizable to some degree (shrunk, with the coloring off, but readable) and is duplicated 4 times horizontally. If I change the pixel format to Format8bppIndexed, it is also recognizable to some degree and is duplicated 8 times horizontally.

The image below is the result of the Format1bppIndexed pixel format approach. Unfortunately, I am unable to show the others due to security constraints.

The code is shown below. It is essentially the only solution I have come across, scattered around both SO and the web.

int xrefIdx = ((PRIndirectReference)obj).Number;
PdfObject pdfObj = doc.GetPdfObject(xrefIdx);
PdfStream str = (PdfStream)(pdfObj);
byte[] bytes = PdfReader.GetStreamBytesRaw((PRStream)str);

string filter = ((PdfArray)tg.Get(PdfName.FILTER))[0].ToString();
string width = tg.Get(PdfName.WIDTH).ToString();
string height = tg.Get(PdfName.HEIGHT).ToString();
string bpp = tg.Get(PdfName.BITSPERCOMPONENT).ToString();

if (filter == "/FlateDecode")
{
   bytes = PdfReader.FlateDecode(bytes, true);

   System.Drawing.Imaging.PixelFormat pixelFormat;
   switch (int.Parse(bpp))
   {
      case 1:
         pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed;
         break;
      case 8:
         pixelFormat = System.Drawing.Imaging.PixelFormat.Format8bppIndexed;
         break;
      case 24:
         pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb;
         break;
      default:
         throw new Exception("Unknown pixel format " + bpp);
   }

   var bmp = new System.Drawing.Bitmap(Int32.Parse(width), Int32.Parse(height), pixelFormat);
   System.Drawing.Imaging.BitmapData bmd = bmp.LockBits(new System.Drawing.Rectangle(0, 0, Int32.Parse(width),
             Int32.Parse(height)), System.Drawing.Imaging.ImageLockMode.WriteOnly, pixelFormat);
   Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length);
   bmp.UnlockBits(bmd);
   bmp.Save(@"C:\temp\my_flate_picture-" + DateTime.Now.Ticks.ToString() + ".png", ImageFormat.Png);
}



What do I have to do so that my image extraction works as expected when dealing with FlateDecode?

NOTE: I do not want to use another library to extract the images. I am looking for a solution leveraging ONLY iTextSharp and the .NET FW. If a solution exists via Java (iText) and is easily portable to .NET FW bits that would suffice as well.

UPDATE: The ImageMask property is set to true, which would imply that there is no color space and that the image is therefore implicitly black and white. With the bpp coming in at 1, the PixelFormat should be Format1bppIndexed, which, as mentioned earlier, produces the embedded image seen above.

UPDATE: To get the image size, I extracted the image using Acrobat X Pro, and the size for this particular example was listed as 2403x3005. When extracting via iTextSharp, the size was listed as 2544x3300. I modified the image size within the debugger to mirror 2403x3005; however, upon calling Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length); an exception is raised:

Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

My assumption is that this is due to the modified size no longer corresponding to the byte data that is being used.

UPDATE: Per Jimmy's recommendation, I checked whether PdfReader.GetStreamBytes returns a byte[] whose length equals width*height/8, since GetStreamBytes should be calling FlateDecode. Manually calling FlateDecode and calling PdfReader.GetStreamBytes both produced a byte[] of length 1049401, while width*height/8 is 2544*3300/8, or 1049400, so there is a difference of 1. I am not sure whether this off-by-one is the root cause, and if it is, I am not sure how to resolve it.

UPDATE: In trying the approach mentioned by kuujinbo, I am met with an IndexOutOfRangeException when I call renderInfo.GetImage(); within the RenderImage listener. The fact that width*height/8, as stated earlier, is off by 1 compared to the byte[] length returned by FlateDecode makes me think these are all one and the same issue; however, a solution still eludes me.

   at System.util.zlib.Adler32.adler32(Int64 adler, Byte[] buf, Int32 index, Int32 len)
   at System.util.zlib.ZStream.read_buf(Byte[] buf, Int32 start, Int32 size)
   at System.util.zlib.Deflate.fill_window()
   at System.util.zlib.Deflate.deflate_slow(Int32 flush)
   at System.util.zlib.Deflate.deflate(ZStream strm, Int32 flush)
   at System.util.zlib.ZStream.deflate(Int32 flush)
   at System.util.zlib.ZDeflaterOutputStream.Write(Byte[] b, Int32 off, Int32 len)
   at iTextSharp.text.pdf.codec.PngWriter.WriteData(Byte[] data, Int32 stride)
   at iTextSharp.text.pdf.parser.PdfImageObject.DecodeImageBytes()
   at iTextSharp.text.pdf.parser.PdfImageObject..ctor(PdfDictionary dictionary, Byte[] samples)
   at iTextSharp.text.pdf.parser.PdfImageObject..ctor(PRStream stream)
   at iTextSharp.text.pdf.parser.ImageRenderInfo.PrepareImageObject()
   at iTextSharp.text.pdf.parser.ImageRenderInfo.GetImage()
   at cyos.infrastructure.Core.MyImageRenderListener.RenderImage(ImageRenderInfo renderInfo)

UPDATE: Trying the various methods listed here (my original solution as well as the solution posed by kuujinbo) with a different page in the PDF produces imagery; however, the issues always surface when the filter type is /FlateDecode, and no image is produced for that instance.

Recommended answer

Try copying your data row by row; that may solve the problem.

int w = imgObj.GetAsNumber(PdfName.WIDTH).IntValue;
int h = imgObj.GetAsNumber(PdfName.HEIGHT).IntValue;
int bpp = imgObj.GetAsNumber(PdfName.BITSPERCOMPONENT).IntValue;
var pixelFormat = PixelFormat.Format1bppIndexed;

byte[] rawBytes = PdfReader.GetStreamBytesRaw((PRStream)imgObj);
byte[] decodedBytes = PdfReader.FlateDecode(rawBytes);
byte[] streamBytes = PdfReader.DecodePredictor(decodedBytes, imgObj.GetAsDict(PdfName.DECODEPARMS));
// byte[] streamBytes = PdfReader.GetStreamBytes((PRStream)imgObj); // same result as above 3 lines of code.

using (Bitmap bmp = new Bitmap(w, h, pixelFormat))
{
    var bmpData = bmp.LockBits(new Rectangle(0, 0, w, h), ImageLockMode.WriteOnly, pixelFormat);
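    // Bytes per row in the decoded PDF data; rows are padded only to a whole byte.
    int length = (int)Math.Ceiling(w * bpp / 8.0);
    for (int i = 0; i < h; i++)
    {
        int offset = i * length;
        // Bitmap rows start every bmpData.Stride bytes, which may be larger than length.
        int scanOffset = i * bmpData.Stride;
        // IntPtr.Add avoids the OverflowException that Scan0.ToInt32() can throw in a 64-bit process.
        Marshal.Copy(streamBytes, offset, IntPtr.Add(bmpData.Scan0, scanOffset), length);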
    }
    bmp.UnlockBits(bmpData);

    bmp.Save(fileName);
}
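
A likely reason the row-by-row copy helps: decoded PDF image rows are padded only to a whole byte, while a GDI+ bitmap pads each row (its stride) to a multiple of 4 bytes. For the 2544x3300, 1 bpp image in the question, that would be 318 bytes per source row against a 320-byte stride, so a single Marshal.Copy of the whole buffer shifts every row a little further than the last. The following is a minimal sketch of that arithmetic using plain byte arrays only. RowPacking, PackedRowBytes, BitmapStride and PadRows are hypothetical helper names for illustration, not part of iTextSharp or .NET, and the 4-byte rounding is an assumption about the bitmap returned by LockBits that is worth checking against bmpData.Stride.

```csharp
using System;

public static class RowPacking
{
    // Bytes per row in the decoded PDF stream: each row is padded to a whole byte.
    public static int PackedRowBytes(int width, int bitsPerPixel)
    {
        return (width * bitsPerPixel + 7) / 8;
    }

    // Bytes per row in the bitmap buffer, assuming rows are padded to a multiple of 4 bytes.
    public static int BitmapStride(int width, int bitsPerPixel)
    {
        return ((width * bitsPerPixel + 31) / 32) * 4;
    }

    // Copies tightly packed rows into a new buffer laid out with the padded stride.
    // Any bytes beyond rowBytes * height in the input are ignored.
    public static byte[] PadRows(byte[] packed, int width, int height, int bitsPerPixel)
    {
        int rowBytes = PackedRowBytes(width, bitsPerPixel);
        int stride = BitmapStride(width, bitsPerPixel);
        if (packed.Length < rowBytes * height)
        {
            throw new ArgumentException("Not enough data for the given dimensions.", "packed");
        }

        byte[] padded = new byte[stride * height];
        for (int row = 0; row < height; row++)
        {
            // Source rows are rowBytes apart; destination rows are stride apart.
            Buffer.BlockCopy(packed, row * rowBytes, padded, row * stride, rowBytes);
        }
        return padded;
    }
}
```

With the question's dimensions, PackedRowBytes(2544, 1) is 318 and BitmapStride(2544, 1) is 320, so the packed data is 318 * 3300 = 1049400 bytes while the bitmap buffer is 320 * 3300 = 1056000 bytes. If bmpData.Stride matches the computed stride, the padded array could be written with one Marshal.Copy(padded, 0, bmpData.Scan0, padded.Length); otherwise, copying one row at a time against bmpData.Stride, as in the answer above, is the safer route.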
