Extract image from PDF using iTextSharp
I am trying to extract all the images from a PDF using iTextSharp, but can't seem to overcome this one hurdle.
The error occurs on the line System.Drawing.Image ImgPDF = System.Drawing.Image.FromStream(MS);
giving an error of "Parameter is not valid".
I think it works when the image is a bitmap, but not with any other format.
Here is my code; sorry for the length:
private void Form1_Load(object sender, EventArgs e)
{
    // Read the whole PDF into memory
    FileStream fs = File.OpenRead(@"reader.pdf");
    byte[] data = new byte[fs.Length];
    fs.Read(data, 0, (int)fs.Length);

    List<System.Drawing.Image> ImgList = new List<System.Drawing.Image>();
    iTextSharp.text.pdf.RandomAccessFileOrArray RAFObj = null;
    iTextSharp.text.pdf.PdfReader PDFReaderObj = null;
    iTextSharp.text.pdf.PdfObject PDFObj = null;
    iTextSharp.text.pdf.PdfStream PDFStremObj = null;

    try
    {
        RAFObj = new iTextSharp.text.pdf.RandomAccessFileOrArray(data);
        PDFReaderObj = new iTextSharp.text.pdf.PdfReader(RAFObj, null);

        // Walk every object in the cross-reference table
        for (int i = 0; i <= PDFReaderObj.XrefSize - 1; i++)
        {
            PDFObj = PDFReaderObj.GetPdfObject(i);
            if ((PDFObj != null) && PDFObj.IsStream())
            {
                PDFStremObj = (iTextSharp.text.pdf.PdfStream)PDFObj;
                iTextSharp.text.pdf.PdfObject subtype = PDFStremObj.Get(iTextSharp.text.pdf.PdfName.SUBTYPE);

                // Only streams with /Subtype /Image are of interest
                if ((subtype != null) && subtype.ToString() == iTextSharp.text.pdf.PdfName.IMAGE.ToString())
                {
                    byte[] bytes = iTextSharp.text.pdf.PdfReader.GetStreamBytesRaw((iTextSharp.text.pdf.PRStream)PDFStremObj);
                    if ((bytes != null))
                    {
                        try
                        {
                            System.IO.MemoryStream MS = new System.IO.MemoryStream(bytes);
                            MS.Position = 0;
                            // The "Parameter is not valid" error is raised on the next line
                            System.Drawing.Image ImgPDF = System.Drawing.Image.FromStream(MS);
                            ImgList.Add(ImgPDF);
                        }
                        catch (Exception)
                        {
                        }
                    }
                }
            }
        }
        PDFReaderObj.Close();
    }
    catch (Exception ex)
    {
        throw new Exception(ex.Message);
    }
} //Form1_Load
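A possible explanation for the behaviour described above, offered as a sketch rather than a confirmed diagnosis: GetStreamBytesRaw returns the stream bytes exactly as they are stored in the PDF, without applying the stream's filter. Those bytes only form a complete image file when the image was embedded as one, which is typically the case for JPEG data (the DCTDecode filter). For other filters the bytes are compressed pixel samples with no file header, and Image.FromStream has nothing it can parse. The helper below is hypothetical (RawImageBytes and LooksLikeJpeg are not part of iTextSharp); it only checks for the JPEG start-of-image signature so you can see which raw streams could plausibly be opened directly.

```csharp
using System;

public static class RawImageBytes
{
    // Hypothetical helper, not an iTextSharp API.
    // Returns true when the buffer starts with the JPEG signature FF D8 FF,
    // i.e. when the raw stream bytes look like a self-contained JPEG file.
    public static bool LooksLikeJpeg(byte[] bytes)
    {
        return bytes != null
            && bytes.Length >= 3
            && bytes[0] == 0xFF
            && bytes[1] == 0xD8
            && bytes[2] == 0xFF;
    }
}
```

Calling RawImageBytes.LooksLikeJpeg(bytes) just before Image.FromStream in the loop above would show whether the failing streams are the ones that are not JPEG data.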
I have used this library in the past with no problems. It should be exactly what you're after.
http://www.winnovative-software.com/PdfImgExtractor.aspx
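If you would rather stay within iTextSharp, here is a minimal sketch of an alternative, assuming iTextSharp 5.x, where the iTextSharp.text.pdf.parser namespace provides PdfImageObject. That class decodes the stream according to its filter before handing back an image, instead of passing the raw bytes to GDI+. The names PdfImageExtractor and ExtractImages are made up for this example, and some encodings may still be unsupported, so failures are logged per image rather than aborting the whole run.

```csharp
using System;
using System.Collections.Generic;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

public static class PdfImageExtractor
{
    // Hypothetical helper (not from the original post). It walks the
    // cross-reference table like the question's code, but lets
    // PdfImageObject decode each image stream.
    public static List<System.Drawing.Image> ExtractImages(string pdfPath)
    {
        var images = new List<System.Drawing.Image>();
        var reader = new PdfReader(pdfPath);
        try
        {
            for (int i = 0; i < reader.XrefSize; i++)
            {
                PdfObject obj = reader.GetPdfObject(i);
                if (obj == null || !obj.IsStream())
                    continue;

                var stream = (PRStream)obj;
                PdfObject subtype = stream.Get(PdfName.SUBTYPE);
                if (subtype == null || !PdfName.IMAGE.Equals(subtype))
                    continue;

                try
                {
                    // Decodes the stream using its declared filter
                    var imageObject = new PdfImageObject(stream);
                    System.Drawing.Image image = imageObject.GetDrawingImage();
                    if (image != null)
                        images.Add(image);
                }
                catch (Exception ex)
                {
                    // Assumption: unsupported encodings surface as exceptions;
                    // skip that image and keep going.
                    System.Diagnostics.Debug.WriteLine("Skipped object " + i + ": " + ex.Message);
                }
            }
        }
        finally
        {
            reader.Close();
        }
        return images;
    }
}
```

Usage would look like var images = PdfImageExtractor.ExtractImages(@"reader.pdf"); in place of the body of Form1_Load.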