HTML to PDF conversion, replace placeholder with Text Field


Problem description


I have been scouring the web for a way to replace text during an HTML to PDF conversion. What I am trying to accomplish is to create HTML files that will be parsed to PDF, but I need to incorporate text fields into this conversion at some point. I've been trying to add them once the parse is completed, but that requires exact coordinates to place each text field, and I need it to be more dynamic than that.

I am using iTextSharp to attempt to accomplish this.

For example, let's say I have a short HTML snippet for a 1x1 table (a single cell), and the placeholder is in that cell.

<html>
<body> 
<table>
<tr>
<td>
<p>%SomePlaceholderText%</p>
</td>
</tr>
</table>
</body>
</html>




Question:
Is it possible, before the conversion takes place (or during, or after, really), to iterate through the text that will be written to the PDF, look for placeholders, and replace each one with a call that inserts a text field? I have tried using TextExtractionStrategy, but I am not very seasoned in C# and it feels like it's above my level of experience. All I want to do is replace the placeholder with a text field dynamically. Is this at all possible with iTextSharp, or should I try another library?
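For the "before the conversion" variant, one possible first step is simply to list the placeholders present in the HTML source, so that each one can later be mapped to a field name. The sketch below is only an illustration: it assumes placeholders are written as %Name% with word characters between the percent signs, and the module and function names (PlaceholderScan, FindPlaceholders) are hypothetical. It uses only the .NET regular expression classes, not iTextSharp, and it does not locate anything on the PDF page.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module PlaceholderScan
    ' Returns the distinct placeholder names (without the % delimiters)
    ' in the order they first appear in the HTML.
    ' Assumption: a placeholder is %Name%, where Name is letters, digits or underscores.
    Function FindPlaceholders(html As String) As List(Of String)
        Dim names As New List(Of String)
        For Each m As Match In Regex.Matches(html, "%(\w+)%")
            Dim name = m.Groups(1).Value
            If Not names.Contains(name) Then names.Add(name)
        Next
        Return names
    End Function

    Sub Main()
        Dim html = "<table><tr><td><p>%SomePlaceholderText%</p></td></tr></table>"
        For Each name In FindPlaceholders(html)
            Console.WriteLine(name)   ' prints SomePlaceholderText
        Next
    End Sub
End Module
```

Knowing the names up front only helps with naming the fields; the position of each placeholder on the page still has to come from the PDF itself after conversion.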

After what felt like days of searching, I found a great VB.Net solution on Stack Overflow for this particular requirement. I am still in the process of adapting it to C#; once the conversion is finished, I'll post a link to the source code here, underneath the VB.Net link.

Essentially, this solution parses an existing PDF and searches it for specific text values. If the extractor encounters matching text, it draws a pink rectangle around it. What it actually does with the extracted text is completely customizable, though. I have it drawing a rectangle around the text and dropping a text field in its place. I'll include the piece that needs to be swapped in VB to do this, so anyone who sees this can get the general idea of how to adapt the handling of the extracted text to their needs.

VB Link
http://stackoverflow.com/questions/6523243/how-to-highlight-a-text-or-word-in-a-pdf-file-using-itextsharp It is the last answer; just click on the word HERE that is linked, and the download dialog will open.

c# Link

Coming soon

In Form1.vb, inside

Public Sub PDFTextGetter(ByVal pSearch As String, ByVal SC As StringComparison, ByVal SourceFile As String, ByVal DestinationFile As String)

directly underneath the comment that I include with the code, you can do whatever you please with the rectangle drawn around the extracted text.

'MatchesFound contains all text with locations, so do whatever you want with it, this highlights them using PINK color:
Dim AccountFields = 1
Dim MeterFields = 1
For Each rect As iTextSharp.text.Rectangle In MatchesFound
    cb.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height + 2)

    'Place a text field over the matched text; the counter keeps each field name unique.
    Dim field As New TextField(stamper.Writer, New iTextSharp.text.Rectangle(rect.Left, rect.Bottom, rect.Right, rect.Top + 2), "AccountNumber" & AccountFields)
    Dim form = stamper.AcroFields
    Dim fieldKeys = form.Fields.Keys
    stamper.AddAnnotation(field.GetTextField(), page)
    AccountFields += 1
Next




The comment says it will fill in the rectangle, but I left it in only so you can trace the code back to its original spot in the solution you download. Obviously, that is not what the snippet I provide does. Anyone with a background in iTextSharp can piece that together, but I figured I would spell it out as well.
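For readers who do not have the downloaded project open, here is a minimal, self-contained sketch of the stamping step around that loop. It assumes iTextSharp 5.x and assumes the match rectangles for one page have already been collected (for example, the MatchesFound list from the linked project). The module name, the AddTextFields signature, and the white cover box are illustrative choices of mine, not part of the original solution.

```vb
Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Module PlaceholderFields
    ' matches: rectangles of the placeholder text found on the given page
    ' (hypothetical input; the linked project builds a similar list).
    Sub AddTextFields(sourceFile As String, destinationFile As String,
                      page As Integer, matches As IEnumerable(Of Rectangle))
        Dim reader As New PdfReader(sourceFile)
        Using output As New FileStream(destinationFile, FileMode.Create)
            Dim stamper As New PdfStamper(reader, output)
            ' GetOverContent draws on top of the existing page content;
            ' GetUnderContent would draw beneath it.
            Dim cb As PdfContentByte = stamper.GetOverContent(page)
            Dim fieldIndex As Integer = 1
            For Each rect As Rectangle In matches
                ' Assumption: hide the placeholder text behind a white box.
                cb.SetColorFill(BaseColor.WHITE)
                cb.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height + 2)
                cb.Fill()

                ' Add a uniquely named text field over the same area.
                Dim field As New TextField(stamper.Writer,
                    New Rectangle(rect.Left, rect.Bottom, rect.Right, rect.Top + 2),
                    "AccountNumber" & fieldIndex)
                stamper.AddAnnotation(field.GetTextField(), page)
                fieldIndex += 1
            Next
            stamper.Close()
        End Using
        reader.Close()
    End Sub
End Module
```

The white box only covers the placeholder visually; the original text remains in the page content stream, so this is a cosmetic replacement rather than a true removal.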

*Edit* Just a side note: the default writing mode for the ContentByte in this program is GetUnderContent, so if you want to draw anything on top of the existing page content, simply switch that to GetOverContent and you will be golden.
Cheers!!

