iTextSharp HTMLWorker ParseHTML Tablestyle and PDFStamper


Problem description

Hi, I have successfully used an HTMLWorker to convert a GridView using ASP.NET / C#.

(1) I have applied some limited styling to the resulting table, but I cannot see how to apply a table style (for instance grid lines) or other formatting, such as a larger column width for a particular column.

(2) I would actually like to put this text onto a pre-existing template which contains a logo etc. I've used PdfStamper for this before, but I cannot see how to use both PdfStamper and HTMLWorker at once. HTMLWorker needs a Document which implements IDocListener, but that doesn't seem compatible with using a PdfStamper. I guess what I am looking for is a way to create a PdfStamper, write the title etc., then add the parsed HTML from the grid.

The other problem is that the parsed content doesn't interact with the other content on the page. For instance, below I add a title chunk to the page. Rather than starting below it, the parsed HTML writes over the top of it. How do I position the parsed HTML content relative to the rest of what is in the PDF document?

Thanks in advance,
Rob

Here's the code I have already:

            Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 30f, 0f);

            HTMLWorker htmlWorker = new HTMLWorker(pdfDoc);

            StyleSheet styles = new StyleSheet();
            styles.LoadTagStyle("th", "size", "12px");
            styles.LoadTagStyle("th", "face", "helvetica");
            styles.LoadTagStyle("span", "size", "10px");
            styles.LoadTagStyle("span", "face", "helvetica");                
            styles.LoadTagStyle("td", "size", "10px");
            styles.LoadTagStyle("td", "face", "helvetica");     

            htmlWorker.SetStyleSheet(styles);

            PdfWriter.GetInstance(pdfDoc, HttpContext.Current.Response.OutputStream);

            pdfDoc.Open();

            //Title - but this gets obscured by the data; the data doesn't move down
            Font font = new Font(Font.FontFamily.HELVETICA, 14, Font.BOLD);
            Chunk chunk = new Chunk(title, font);                
            pdfDoc.Add(chunk);


            //Body
            htmlWorker.Parse(sr);

Solution

Let me first give you a couple of links to look over when you get a chance:

  1. ItextSharp support for HTML and CSS
  2. How to apply font properties on while passing html to pdf using itextsharp

These answers go deeper into what's going on and I recommend reading them when you get a chance. Specifically the second one will show you why you need to use pt instead of px.

To answer your first question let me show you a different way to use the HTMLWorker class. This class has a static method on it called ParseToList that will convert HTML to a List<IElement>. The objects in that list are all iTextSharp specific versions of your HTML. Normally you would do a foreach on those and just add them to a document but you can modify them before adding which is what you want to do. Below is code that takes a static string and does that:

string file1 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File1.pdf");

using (FileStream fs = new FileStream(file1, FileMode.Create, FileAccess.Write, FileShare.None))
{
    using (Document doc = new Document(PageSize.LETTER))
    {
        using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
        {
            doc.Open();
            //Our HTML
            string html = "<table><tr><th>First Name</th><th>Last Name</th></tr><tr><td>Chris</td><td>Haas</td></tr></table>";
            //ParseToList requires a TextReader instead of just a string so wrap it in a StringReader
            using (StringReader sr = new StringReader(html))
            {
                //Create a style sheet
                StyleSheet styles = new StyleSheet();
                //...styles omitted for brevity

                //Convert our HTML to iTextSharp elements
                List<IElement> elements = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(sr, styles);
                //Loop through each element (in this case there's actually just one PdfPTable)
                foreach (IElement el in elements)
                {
                    //If the element is a PdfPTable
                    if (el is PdfPTable)
                    {
                        //Cast it
                        PdfPTable tt = (PdfPTable)el;
                        //Change the widths, these are relative widths by the way
                        tt.SetWidths(new float[] { 75, 25 });
                    }
                    //Add the element to the document
                    doc.Add(el);
                }
            }
            doc.Close();
        }
    }
}

Hopefully you can see that once you get access to the raw PdfPTable you can tweak it as necessary.
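As one illustration of that kind of tweak, here is a minimal sketch that adds grid lines to a parsed table, which is what the first question asked about. The class name `TableTweaks` and the method `ApplyGridLines` are hypothetical helpers invented for this example; the sketch assumes the iTextSharp 5.x `PdfPTable`, `PdfPRow` and `PdfPCell` classes and has not been checked against other versions.

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class TableTweaks
{
    // Hypothetical helper (not part of iTextSharp): draws a box border
    // of the given width around every cell of an existing table.
    public static void ApplyGridLines(PdfPTable table, float borderWidth)
    {
        foreach (PdfPRow row in table.Rows)
        {
            foreach (PdfPCell cell in row.GetCells())
            {
                // Slots covered by a colspan can be null, so skip them
                if (cell == null) continue;
                // BOX turns on the top, bottom, left and right borders
                cell.Border = iTextSharp.text.Rectangle.BOX;
                cell.BorderWidth = borderWidth;
            }
        }
    }
}
```

Inside the foreach loop above, a call such as `TableTweaks.ApplyGridLines(tt, 0.5f);` placed next to `tt.SetWidths(...)` would apply it before the table is added to the document.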

To answer your second question, if you want to use the normal Paragraph and Chunk objects with a PdfStamper then you need to use a PdfContentByte object. You can get this from your stamper in one of two ways: either by asking for one that sits "above" existing content, stamper.GetOverContent(int), or one that sits "below" existing content, stamper.GetUnderContent(int). Both versions take a single parameter saying which page to work with. Once you have a PdfContentByte you can create a ColumnText object bound to it and use this object's AddElement() method to add your normal elements. Before doing this (and this answers your third question), you'll want to create at least one "column". When I do this I generally create one that essentially covers the entire page. (This part might sound weird but we're essentially making a single-row, single-column table cell to add our objects to.)

Below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.1.0 that shows off everything above. First it creates a generic PDF on the desktop. Then it creates a second document based off of the first, adds a paragraph and then some HTML. See the comments in the code for any questions.

using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;
using System.IO;


namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            //The two files that we are creating
            string file1 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File1.pdf");
            string file2 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File2.pdf");

            //Create a base file to write on top of
            using (FileStream fs = new FileStream(file1, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (Document doc = new Document(PageSize.LETTER))
                {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
                    {
                        doc.Open();
                        doc.Add(new Paragraph("Hello world"));
                        doc.Close();
                    }
                }
            }

            //Bind a reader to our first document
            PdfReader reader = new PdfReader(file1);

            //Create our second document
            using (FileStream fs = new FileStream(file2, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (PdfStamper stamper = new PdfStamper(reader, fs))
                {
                    StyleSheet styles = new StyleSheet();
                    //...styles omitted for brevity

                    //Our HTML
                    string html = "<table><tr><th>First Name</th><th>Last Name</th></tr><tr><td>Chris</td><td>Haas</td></tr></table>";
                    //ParseToList requires a TextReader instead of just a string so wrap it in a StringReader
                    using (StringReader sr = new StringReader(html))
                    {
                        //Get our raw PdfContentByte object letting us draw "above" existing content
                        PdfContentByte cb = stamper.GetOverContent(1);
                        //Create a new ColumnText object bound to the above PdfContentByte object
                        ColumnText ct = new ColumnText(cb);
                        //Get the dimensions of the first page of our source document
                        iTextSharp.text.Rectangle page1size = reader.GetPageSize(1);
                        //Create a single column object spanning the entire page
                        ct.SetSimpleColumn(0, 0, page1size.Width, page1size.Height);

                        ct.AddElement(new Paragraph("Hello world!"));

                        //Convert our HTML to iTextSharp elements
                        List<IElement> elements = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(sr, styles);
                        //Loop through each element (in this case there's actually just one PdfPTable)
                        foreach (IElement el in elements)
                        {
                            //If the element is a PdfPTable
                            if (el is PdfPTable)
                            {
                                //Cast it
                                PdfPTable tt = (PdfPTable)el;
                                //Change the widths, these are relative widths by the way
                                tt.SetWidths(new float[] { 75, 25 });
                            }
                            //Add the element to the ColumnText
                            ct.AddElement(el);
                        }
                        //IMPORTANT, this actually commits our object to the PDF
                        ct.Go();
                    }
                }
            }

            this.Close();
        }
    }
}
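
The column in the example above spans the whole page, so anything added starts at the very top. If the template has a logo or title band, a smaller rectangle keeps the parsed HTML below it. The following is only a sketch under assumptions: `TemplateColumn`, `CreateBelowHeader`, `headerHeight` and `margin` are hypothetical names introduced here, and the header height has to be measured from your own template.

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class TemplateColumn
{
    // Hypothetical helper (not part of iTextSharp): creates a ColumnText
    // whose rectangle begins below a header band of the given height,
    // so elements added to it do not overlap the template's header.
    public static ColumnText CreateBelowHeader(PdfStamper stamper, PdfReader reader,
                                               int pageNumber, float headerHeight, float margin)
    {
        // Draw above the existing content of the chosen page
        PdfContentByte cb = stamper.GetOverContent(pageNumber);
        // Page dimensions in points, taken from the source document
        iTextSharp.text.Rectangle pageSize = reader.GetPageSize(pageNumber);

        ColumnText ct = new ColumnText(cb);
        // Lower-left and upper-right corners of the column
        ct.SetSimpleColumn(
            pageSize.Left + margin,
            pageSize.Bottom + margin,
            pageSize.Right - margin,
            pageSize.Top - headerHeight);
        return ct;
    }
}
```

With this helper, the `new ColumnText(cb)` and `SetSimpleColumn(...)` lines in the full example could be replaced by something like `ColumnText ct = TemplateColumn.CreateBelowHeader(stamper, reader, 1, 100f, 36f);`, followed by the same `AddElement()` calls and the final `ct.Go()`.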
