Extract images from a PDF using iTextSharp

Views: 320 | Posted: 2016/8/26 21:46:43 | Tags: c# image pdf itextsharp

Problem description

I am trying to extract all the images from a PDF using iTextSharp, but I can't seem to get past one hurdle.

The error occurs on the line System.Drawing.Image ImgPDF = System.Drawing.Image.FromStream(MS); and the message is "Parameter is not valid".

I think it works when the image is a bitmap, but not for any other format.

Here is my code (sorry for the length):

    private void Form1_Load(object sender, EventArgs e)
    {
        // Read the whole PDF into memory; File.ReadAllBytes also closes the file handle.
        byte[] data = File.ReadAllBytes(@"reader.pdf");

        List<System.Drawing.Image> ImgList = new List<System.Drawing.Image>();

        iTextSharp.text.pdf.RandomAccessFileOrArray RAFObj = null;
        iTextSharp.text.pdf.PdfReader PDFReaderObj = null;
        iTextSharp.text.pdf.PdfObject PDFObj = null;
        iTextSharp.text.pdf.PdfStream PDFStremObj = null;

        try
        {
            RAFObj = new iTextSharp.text.pdf.RandomAccessFileOrArray(data);
            PDFReaderObj = new iTextSharp.text.pdf.PdfReader(RAFObj, null);

            for (int i = 0; i <= PDFReaderObj.XrefSize - 1; i++)
            {
                PDFObj = PDFReaderObj.GetPdfObject(i);

                if ((PDFObj != null) && PDFObj.IsStream())
                {
                    PDFStremObj = (iTextSharp.text.pdf.PdfStream)PDFObj;
                    iTextSharp.text.pdf.PdfObject subtype = PDFStremObj.Get(iTextSharp.text.pdf.PdfName.SUBTYPE);

                    if ((subtype != null) && subtype.ToString() == iTextSharp.text.pdf.PdfName.IMAGE.ToString())
                    {
                        byte[] bytes = iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw((iTextSharp.text.pdf.PRStream)PDFStremObj);

                        if ((bytes != null))
                        {
                            try
                            {
                                System.IO.MemoryStream MS = new System.IO.MemoryStream(bytes);
                                MS.Position = 0;

                                // "Parameter is not valid" is thrown on this call.
                                System.Drawing.Image ImgPDF = System.Drawing.Image.FromStream(MS);
                                ImgList.Add(ImgPDF);
                            }
                            catch (Exception)
                            {
                                // Failures for individual images are ignored.
                            }
                        }
                    }
                }
            }
            PDFReaderObj.Close();
        }
        catch (Exception ex)
        {
            throw new Exception(ex.Message);
        }
    } // Form1_Load

Solution

I have used this library in the past with no problems. It should be exactly what you're after.

http://www.winnovative-software.com/PdfImgExtractor.aspx
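
As an alternative that stays within iTextSharp, here is a minimal sketch, assuming iTextSharp 5.x, where the `iTextSharp.text.pdf.parser` namespace provides `PdfImageObject`. The likely cause of "Parameter is not valid" is that `GetStreamBytesRaw` returns the stream exactly as it is stored in the PDF, and those bytes only form a complete image file for some encodings (JPEG data stored with the `DCTDecode` filter, for example). `PdfImageObject` decodes the stream according to its filter before handing it to System.Drawing. The names `PdfImageSketch` and `ExtractImages` are hypothetical, and any image the installed version cannot decode is reported and skipped rather than handled.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

static class PdfImageSketch
{
    // Hypothetical helper; assumes iTextSharp 5.x, where PdfImageObject is available.
    public static List<System.Drawing.Image> ExtractImages(string pdfPath)
    {
        var images = new List<System.Drawing.Image>();
        var reader = new PdfReader(pdfPath);
        try
        {
            // Walk every entry in the cross-reference table, as the question's code does.
            for (int i = 0; i < reader.XrefSize; i++)
            {
                PdfObject obj = reader.GetPdfObject(i);
                if (obj == null || !obj.IsStream()) continue;

                var stream = (PRStream)obj;
                if (!PdfName.IMAGE.Equals(stream.GetAsName(PdfName.SUBTYPE))) continue;

                try
                {
                    // Decode the stream according to its /Filter entry instead of
                    // passing the raw bytes straight to Image.FromStream.
                    var pdfImage = new PdfImageObject(stream);
                    System.Drawing.Image img = pdfImage.GetDrawingImage();
                    if (img != null) images.Add(img);
                }
                catch (Exception ex)
                {
                    // Some encodings may not be supported; report the object and move on.
                    Console.Error.WriteLine("Skipping object " + i + ": " + ex.Message);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return images;
    }
}
```

It could be called as `var images = PdfImageSketch.ExtractImages(@"reader.pdf");` in place of the loop in `Form1_Load`.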
