Need help with creating a PDF from HTML using iTextSharp
Question
I'm trying to create a PDF out of an HTML page. The CMS I'm using is EPiServer.
This is my code so far:
protected void Button1_Click(object sender, EventArgs e)
{
    naaflib.pdfDocument(CurrentPage);
}

public static void pdfDocument(PageData pd)
{
    // Extract data from the page (pd).
    string intro = pd["MainIntro"].ToString(); // Attribute
    string mainBody = pd["MainBody"].ToString(); // Attribute

    // Prepare the HttpContext response.
    HttpContext.Current.Response.Clear();
    HttpContext.Current.Response.ContentType = "application/pdf";

    // Create the PDF document.
    Document pdfDocument = new Document(PageSize.A4, 80, 50, 30, 65);
    //PdfWriter pw = PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);
    PdfWriter.GetInstance(pdfDocument, HttpContext.Current.Response.OutputStream);

    pdfDocument.Open();
    pdfDocument.Add(new Paragraph(pd.PageName));
    pdfDocument.Add(new Paragraph(intro));
    pdfDocument.Add(new Paragraph(mainBody));
    pdfDocument.Close();

    HttpContext.Current.Response.End();
}
This outputs the article name, intro text and main body. But it does not parse the HTML in the article text, and there is no layout.
I've tried having a look at http://itextsharp.sourceforge.net/tutorial/index.html without becoming any wiser.
Any pointers in the right direction are greatly appreciated :)
Recommended answer
For newer versions of iTextSharp:
Using iTextSharp, you can use the iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList() method to create a PDF from HTML.
ParseToList() takes a TextReader (an abstract class) for its HTML source, which means you can use a StringReader or a StreamReader (both of which derive from TextReader). I used a StringReader and was able to generate PDFs from simple markup. I tried to use the HTML returned from a web page and got errors on all but the simplest pages. Even the simplest web page I retrieved (http://black.ea.com/) rendered the content of the page's 'head' tag onto the PDF, so I think the HTMLWorker.ParseToList() method is picky about the formatting of the HTML it parses.
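To illustrate the TextReader point, here is a minimal sketch that uses only the .NET base class library. ReadAllHtml is a hypothetical stand-in for any method that accepts a TextReader (it is not part of iTextSharp); the sketch only shows that a StringReader and a StreamReader can be passed interchangeably.

```csharp
using System;
using System.IO;
using System.Text;

public static class ReaderDemo
{
    // Hypothetical stand-in for an API that accepts any TextReader.
    public static string ReadAllHtml(TextReader reader)
    {
        return reader.ReadToEnd();
    }

    public static void Main()
    {
        string html = "<p>Hello <b>World</b></p>";

        // HTML already in memory: wrap the string in a StringReader.
        using (TextReader fromString = new StringReader(html))
        {
            Console.WriteLine(ReadAllHtml(fromString));
        }

        // HTML coming from a stream (file, network response, ...):
        // wrap the stream in a StreamReader. A MemoryStream is used here
        // only to keep the sketch self-contained.
        using (Stream stream = new MemoryStream(Encoding.UTF8.GetBytes(html)))
        using (TextReader fromStream = new StreamReader(stream, Encoding.UTF8))
        {
            Console.WriteLine(ReadAllHtml(fromStream));
        }
    }
}
```

So if the HTML is already a string (for example, a page property), a StringReader is probably the simpler choice; a StreamReader fits better when reading from a file or a response stream.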
Anyway, if you want to try it, here's the test code I used:
// Download content from a very, very simple "Hello World" web page.
string download = new WebClient().DownloadString("http://black.ea.com/");

Document document = new Document(PageSize.A4, 80, 50, 30, 65);
try {
    using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
        PdfWriter.GetInstance(document, fs);
        using (StringReader stringReader = new StringReader(download)) {
            ArrayList parsedList = HTMLWorker.ParseToList(stringReader, null);
            document.Open();
            foreach (object item in parsedList) {
                document.Add((IElement)item);
            }
            document.Close();
        }
    }
} catch (Exception exc) {
    Console.Error.WriteLine(exc.Message);
}
I couldn't find any documentation on which HTML constructs HTMLWorker.ParseToList() supports; if you do, please post it here. I'm sure a lot of people would be interested.
For older versions of iTextSharp:
You can use the iTextSharp.text.html.HtmlParser.Parse method to create a PDF based on HTML.
Here's a snippet demonstrating this:
Document document = new Document(PageSize.A4, 80, 50, 30, 65);
try {
    using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
        PdfWriter.GetInstance(document, fs);
        HtmlParser.Parse(document, "YourHtmlDocument.html");
    }
} catch (Exception exc) {
    Console.Error.WriteLine(exc.Message);
}
The one problem (a major one for me) is that the HTML must be strictly XHTML compliant.
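As a rough pre-check, you could test whether the markup at least parses as well-formed XML before handing it to the parser. The sketch below uses only System.Xml; XhtmlCheck and IsWellFormedXml are hypothetical names, and well-formedness is a necessary condition for XHTML, not proof of full compliance. It assumes .NET 4.0 or later (for DtdProcessing); because the DTD is ignored, HTML named entities such as &nbsp; will also be reported as errors.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XhtmlCheck
{
    // Returns true when the markup parses as well-formed XML.
    // This does not validate against the XHTML DTD or schema.
    public static bool IsWellFormedXml(string markup)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Ignore, // skip any DOCTYPE, no DTD download
            XmlResolver = null                    // never fetch external resources
        };

        try
        {
            using (var text = new StringReader(markup))
            using (var reader = XmlReader.Create(text, settings))
            {
                // Reading to the end forces the parser to check every tag.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        // Self-closed <br/> is well-formed.
        Console.WriteLine(IsWellFormedXml("<p>Hello<br/>World</p>")); // prints "True"

        // Unclosed <br> is fine in HTML but not in XML.
        Console.WriteLine(IsWellFormedXml("<p>Hello<br>World</p>"));  // prints "False"
    }
}
```

If this check fails, the markup would need cleaning up (closing tags, quoting attributes, and so on) before an XML-based parser has a chance of accepting it.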
Good luck!