iTextSharp XMLWorkerHelper and Images for HTML to PDF


Problem description


Bottom line is I'm using iTextSharp to write out HTML to a PDF -- with an image. Right now, I'm at the latest version of iTextSharp which is 5.5.5.0. I have access to Bruno's book, and I'm using the methodology spelled out by demo.iTextSupport.com for the conversion. Unfortunately, the book doesn't appear to have any reference to XMLWorkerHelper, which is what I'm using to create the PDF from the HTML.

Here's the method I finally got working that successfully generates a PDF from a well-formed HTML string:

private string createPDFFromHtml(string htmlString, string outputFileName)
{
    string result = string.Empty;

    try
    {
        if (!string.IsNullOrEmpty(htmlString) && !string.IsNullOrEmpty(outputFileName) && !File.Exists(outputFileName))
        {
            using (FileStream fos = new FileStream(outputFileName, FileMode.Create))
            {
                using (MemoryStream inputMemoryStream = new MemoryStream(Encoding.ASCII.GetBytes(htmlString)))
                {
                    using (TextReader textReader = new StreamReader(inputMemoryStream, Encoding.ASCII))
                    {
                        using (Document pdfDoc = new Document())
                        {
                            using (PdfWriter pdfWriter = PdfWriter.GetInstance(pdfDoc, fos))
                            {
                                XMLWorkerHelper helper = XMLWorkerHelper.GetInstance();
                                pdfDoc.Open();
                                helper.ParseXHtml(pdfWriter, pdfDoc, textReader);
                                result = "Successfully Created new HTML--> PDF Document!";
                                pdfWriter.CloseStream = false;
                            }
                        }
                    }
                }
            }
        }
    }
    catch (Exception ex)
    {
        result = "Exception: " + ex.Message;
    }

    return result;
}

This works. What I'd like to do next is create a letter with an image for the letterhead; the image is just a JPG that I have lying around on my hard drive.

Here's what I've tried. It successfully places the image exactly where and how I want it, but the rest of the PDF is severely truncated.

private string createPDFFromHtmlWithImage(string htmlString, string outputFileName, string headerImagePath)
{
    string result = string.Empty;

    try
    {
        if (!string.IsNullOrEmpty(htmlString) && !string.IsNullOrEmpty(outputFileName) && !File.Exists(outputFileName))
        {
            using (FileStream fos = new FileStream(outputFileName, FileMode.Create))
            {
                using (MemoryStream inputMemoryStream = new MemoryStream(Encoding.ASCII.GetBytes(htmlString)))
                {
                    using (TextReader textReader = new StreamReader(inputMemoryStream, Encoding.ASCII))
                    {
                        using (Document pdfDoc = new Document())
                        {
                            using (PdfWriter pdfWriter = PdfWriter.GetInstance(pdfDoc, fos))
                            {
                                pdfDoc.Open();
                                Image img = Image.GetInstance(headerImagePath);
                                if (img != null)
                                {
                                    img.ScaleToFit(540f, 300f);
                                    pdfDoc.Add(img);
                                }

                                XMLWorkerHelper helper = XMLWorkerHelper.GetInstance();
                                helper.ParseXHtml(pdfWriter, pdfDoc, textReader);

                                result = "Successfully Created new HTML--> PDF Document!";
                                pdfWriter.CloseStream = false;
                            }
                        }
                    }
                }
            }
        }
    }
    catch (Exception ex)
    {
        result = "Exception: " + ex.Message;
    }

    return result;
}

The result is a PDF that has the image I want, followed by basically the first DIV of my HTML (and even that DIV isn't completely shown), then nothing else.

So, I figured I probably shouldn't just blast the textReader into the pdfDoc, but should instead do some "adds" of some sort.

And...here's where I'm getting lost.

I'm thinking I still need to use the XMLWorkerHelper, but I need to do something with IElementHandler rather than just shoving the whole thing into a pdfWriter.

Additional research shows that I can possibly do some tricks with IElements via Chris Haas's wonderful post here.

So, I made my own IElementHandler like Chris shows (except I do things the long way, please bear with me):

public class HtmlElementHandler : IElementHandler
{
    public List<IElement> elementList = new List<IElement>();

    public void Add(IWritable e)
    {
        if (e != null && e is WritableElement)
        {
            WritableElement we = e as WritableElement;

            if (we != null)
            {
                IList<IElement> weList = we.Elements();
                if (weList.Any())
                {
                    elementList.AddRange(weList);
                }
            }
        }
    }
}

Now using this code:

private string createPDFFromHtmlWithImageElemental(string htmlString, string outputFileName, string headerImagePath)
{
    string result = string.Empty;

    try
    {
        if (!string.IsNullOrEmpty(htmlString) && !string.IsNullOrEmpty(outputFileName) && !File.Exists(outputFileName))
        {
            using (FileStream fos = new FileStream(outputFileName, FileMode.Create))
            {
                using (MemoryStream inputMemoryStream = new MemoryStream(Encoding.ASCII.GetBytes(htmlString)))
                {
                    using (TextReader textReader = new StreamReader(inputMemoryStream, Encoding.ASCII))
                    {
                        using (Document pdfDoc = new Document())
                        {
                            using (PdfWriter pdfWriter = PdfWriter.GetInstance(pdfDoc, fos))
                            {
                                pdfDoc.Open();
                                Image img = Image.GetInstance(headerImagePath);
                                if (img != null)
                                {
                                    img.ScaleToFit(540f, 300f);
                                    pdfDoc.Add(img);
                                }

                                HtmlElementHandler htmlElementHandler = new HtmlElementHandler();

                                XMLWorkerHelper helper = XMLWorkerHelper.GetInstance();
                                helper.ParseXHtml(htmlElementHandler, inputMemoryStream, Encoding.ASCII);

                                foreach (IElement ielement in htmlElementHandler.elementList)
                                {
                                    pdfDoc.Add(ielement);
                                }

                                result = "Successfully Created new HTML--> PDF Document!";
                                pdfWriter.CloseStream = false;
                            }
                        }
                    }
                }
            }
        }
    }
    catch (Exception ex)
    {
        result = "Exception: " + ex.Message;
    }

    return result;
}

I get exactly the same results as when I plopped the whole thing into the pdfDoc before.

I can see that my element is actually an iTextSharp.text.pdf.PdfDiv with content. Maybe I could do something with that, but I'm really not much of an expert here, and I feel like I'm going down the rabbit hole without Alice to guide me.

Additional searching indicates there is a way to get an image embedded, but I'm not all that keen on generating the binary-as-text image string for my image and loading it into the HTML like this solution does. I'd like to be able to choose and change images as needed. I guess I could create a way to take an image, create this binary-text, and insert it into my HTML, but I'd rather see if there is another solution first.

So, you can see what I've tried. I'd appreciate any other help you can provide.

Solution

XML Worker isn't mentioned in the book, because the book was written in 2009 and the development of XML Worker started somewhere in 2011. Your question is very long, yet it is missing an important element: an HTML sample like the ones provided for the sandbox examples (which you don't mention). For instance: when we parse the thoreau.html example using ParseHtmlImagesLinksOops, we lose all images: thoreau_oops.pdf; when we use ParseHtmlImagesLinks, we use an ImageProvider that makes sure we get the correct paths to the images, and the result looks quite OK: thoreau.pdf (so do the links, by the way).
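
For readers who want to see roughly what the ImageProvider approach looks like in iTextSharp, here is a minimal sketch. It is not the sandbox example itself: it assumes iTextSharp 5.5.x with the XML Worker assembly referenced, and the names FolderImageProvider, HtmlWithImages, Convert and imageFolder are hypothetical, introduced only for illustration. The idea is to build the XML Worker pipeline by hand instead of calling ParseXHtml, so that an image provider can tell the parser where relative img paths should be resolved.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
using iTextSharp.tool.xml.pipeline.html;

// Hypothetical provider: resolves relative <img src="..."> values against one folder.
public class FolderImageProvider : AbstractImageProvider
{
    private readonly string root;

    public FolderImageProvider(string root)
    {
        this.root = root;
    }

    public override string GetImageRootPath()
    {
        return root;
    }
}

public static class HtmlWithImages
{
    // Assumption: html is well-formed XHTML and imageFolder holds the referenced images.
    public static void Convert(string html, string outputFileName, string imageFolder)
    {
        using (FileStream fos = new FileStream(outputFileName, FileMode.Create))
        using (Document pdfDoc = new Document())
        using (PdfWriter pdfWriter = PdfWriter.GetInstance(pdfDoc, fos))
        using (StringReader reader = new StringReader(html))
        {
            pdfDoc.Open();

            // Default CSS resolver, including XML Worker's built-in default.css.
            ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);

            // HTML context: standard tag processors plus our image provider.
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
            htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
            htmlContext.SetImageProvider(new FolderImageProvider(imageFolder));

            // Pipelines are chained back to front: CSS -> HTML -> PDF writer.
            PdfWriterPipeline pdfPipeline = new PdfWriterPipeline(pdfDoc, pdfWriter);
            HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, pdfPipeline);
            CssResolverPipeline cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

            XMLWorker worker = new XMLWorker(cssPipeline, true);
            XMLParser parser = new XMLParser(worker);
            parser.Parse(reader);

            pdfDoc.Close();
        }
    }
}
```

Using a StringReader here also avoids the round trip through Encoding.ASCII in the question's code, which would replace any non-ASCII characters in the HTML with question marks.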

However, when I look at the actual requirement, I see that you want to create a letter with an image for letterhead. In that case, I would use page events to add company stationery to each page. How to do that is explained in the book.
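
As an illustration of the page-event idea (a sketch under assumptions, not the book's code), the letterhead image can be drawn from a PdfPageEventHelper subclass while the HTML is still parsed with XMLWorkerHelper. The class names LetterheadEvent and LetterWriter, the 18pt gap, and the 36pt margins are hypothetical choices; the 540 x 300 scaling bounds are taken from the question. The top margin is enlarged so that the parsed HTML starts below the image on every page.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical page event: stamps the same header image at the top of every page.
public class LetterheadEvent : PdfPageEventHelper
{
    private const float GapFromTop = 18f; // arbitrary distance from the top page edge
    private readonly Image header;

    public LetterheadEvent(string headerImagePath)
    {
        header = Image.GetInstance(headerImagePath);
        header.ScaleToFit(540f, 300f); // same bounds as in the question
    }

    // Vertical space the letterhead occupies, used to size the top margin.
    public float ReservedHeight
    {
        get { return header.ScaledHeight + GapFromTop; }
    }

    public override void OnEndPage(PdfWriter writer, Document document)
    {
        // Absolute position refers to the lower-left corner of the image.
        float x = document.LeftMargin;
        float y = document.PageSize.Height - ReservedHeight;
        header.SetAbsolutePosition(x, y);

        // Draw on the layer underneath the page content.
        writer.DirectContentUnder.AddImage(header);
    }
}

public static class LetterWriter
{
    public static void Write(string html, string outputFileName, string headerImagePath)
    {
        LetterheadEvent letterhead = new LetterheadEvent(headerImagePath);

        // Reserve room for the letterhead plus some breathing space below it.
        float topMargin = letterhead.ReservedHeight + 36f;

        using (FileStream fos = new FileStream(outputFileName, FileMode.Create))
        using (Document pdfDoc = new Document(PageSize.LETTER, 36f, 36f, topMargin, 36f))
        using (PdfWriter pdfWriter = PdfWriter.GetInstance(pdfDoc, fos))
        using (StringReader reader = new StringReader(html))
        {
            pdfWriter.PageEvent = letterhead; // register before opening the document
            pdfDoc.Open();
            XMLWorkerHelper.GetInstance().ParseXHtml(pdfWriter, pdfDoc, reader);
            pdfDoc.Close();
        }
    }
}
```

Because the image is added through the writer's direct content rather than with pdfDoc.Add, it never becomes part of the document flow that XML Worker writes into, and changing the letterhead is just a matter of passing a different headerImagePath.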
