iTextSharp: Replace text in an existing PDF without losing formatting


Question

I've been searching the Internet for two weeks and found some interesting solutions for my problem, but none of them gives me the answer.

My goal is to do the following:

I want to find a text in a static PDF file and replace it with another text, while keeping the design of the content. Is it really that hard?

I found a way, but it loses all of the layout information, because it only returns the extracted plain text:

    using (PdfReader reader = new PdfReader(path))
    {
        StringBuilder text = new StringBuilder();
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // Extracts plain text only; fonts, positions, and layout are discarded.
            text.Append(PdfTextExtractor.GetTextFromPage(reader, i));
        }

        // Replace once, after the text of all pages has been collected.
        text.Replace(txt_SuchenNach.Text, txt_ErsetzenMit.Text);

        return text.ToString();
    }

My second attempt was much better, but it only works when the text sits inside form fields that I can change:

    string fileNameExisting = path;
    string fileNameNew = @"C:\TEST.pdf";

    using (FileStream existingFileStream = new FileStream(fileNameExisting, FileMode.Open))
    using (FileStream newFileStream = new FileStream(fileNameNew, FileMode.Create))
    {
        // Open the PDF
        PdfReader pdfReader = new PdfReader(existingFileStream);

        PdfStamper stamper = new PdfStamper(pdfReader, newFileStream);

        var form = stamper.AcroFields;
        var fieldKeys = form.Fields.Keys;
        foreach (string fieldKey in fieldKeys)
        {
            // Read the current field value and write it back with the search text replaced
            var value = pdfReader.AcroFields.GetField(fieldKey);
            form.SetField(fieldKey, value.Replace(txt_SuchenNach.Text, txt_ErsetzenMit.Text));
        }

        // Make the text fields non-editable (they then look like normal text)
        stamper.FormFlattening = true;

        stamper.Close();
        pdfReader.Close();
    }

This keeps the formatting of the rest of the text and changes only the text I searched for. I need a solution for text that is NOT in a text field.

Thanks for all your answers and your help.

Recommended answer

The general issue is that text objects may use embedded fonts in which only specific glyphs are assigned to specific letters. For example, if a text object contains the text "abcdef", the embedded font may contain glyphs for only those letters ("abcdef") and none for any others. If you then replace "abcdef" with "xyz", the PDF will not display "xyz", because the font has no glyphs for those letters.
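
To make the subset-font problem concrete, here is a minimal sketch in plain C# (no iTextSharp involved). It models an embedded font as nothing more than the set of characters it has glyphs for, which is a simplification, and reports which characters of a replacement string could not be drawn. The class and method names are hypothetical and exist only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SubsetFontCheck
{
    // Returns the distinct characters of `replacement` that have no glyph
    // in a font subset covering only the characters in `subsetChars`.
    // Simplification: a real font maps character codes to glyphs through
    // its encoding; here the subset is just a set of characters.
    public static List<char> MissingGlyphs(string subsetChars, string replacement)
    {
        var available = new HashSet<char>(subsetChars);
        return replacement
            .Where(c => !available.Contains(c))
            .Distinct()
            .ToList();
    }
}
```

Calling `SubsetFontCheck.MissingGlyphs("abcdef", "xyz")` returns all three characters of the replacement, which is the situation described above: the replacement text cannot be rendered with the original subset font, so a different font has to be supplied for it.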

So I would consider the following workflow:


  • Iterate through all the text objects;
  • Add new text objects, created from scratch, on top of the PDF page and give them the same properties (font, position, etc.) but a different text. This step may require the fonts used in the original PDF to be installed on your machine, but you can check the installed fonts and use another font for the new text object. This way iTextSharp or another PDF tool will embed a new font object for the new text object;
  • Remove the original text object once you have created its duplicate;
  • Process every text object with the workflow described above;
  • Save the modified PDF document into a new file.
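
The workflow above can be roughly sketched with the iTextSharp 5 parser and stamper classes. This is not a full implementation of the answer, and it rests on several assumptions: it does not remove the original text object but only covers it with a white rectangle (so the old text remains in the content stream and stays extractable), it only finds matches that lie entirely within one text chunk, it ignores rotated text, it approximates the font size from the ascent and descent lines, and it draws the new text in standard Helvetica rather than the original font. `MatchCollector`, `Match`, and `ReplaceSketch` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using BaseColor = iTextSharp.text.BaseColor;
using Rectangle = iTextSharp.text.Rectangle;

// Hypothetical container for one text chunk that contains the search string.
class Match
{
    public Rectangle Box;    // area between descent and ascent line
    public float BaselineY;  // y coordinate of the text baseline
    public string Text;      // full text of the chunk
}

// Hypothetical listener that records every chunk containing the search string.
class MatchCollector : IRenderListener
{
    private readonly string _search;
    public readonly List<Match> Matches = new List<Match>();

    public MatchCollector(string search) { _search = search; }

    public void RenderText(TextRenderInfo info)
    {
        string text = info.GetText();
        if (!text.Contains(_search)) return; // matches split across chunks are missed

        Vector bottomLeft = info.GetDescentLine().GetStartPoint();
        Vector topRight = info.GetAscentLine().GetEndPoint();
        Matches.Add(new Match
        {
            Box = new Rectangle(bottomLeft[Vector.I1], bottomLeft[Vector.I2],
                                topRight[Vector.I1], topRight[Vector.I2]),
            BaselineY = info.GetBaseline().GetStartPoint()[Vector.I2],
            Text = text
        });
    }

    public void BeginTextBlock() { }
    public void EndTextBlock() { }
    public void RenderImage(ImageRenderInfo info) { }
}

static class ReplaceSketch
{
    public static void Replace(string src, string dest, string search, string replacement)
    {
        using (var reader = new PdfReader(src))
        using (var output = new FileStream(dest, FileMode.Create))
        using (var stamper = new PdfStamper(reader, output))
        {
            // Assumption: a standard, non-embedded font is acceptable for the new text.
            BaseFont font = BaseFont.CreateFont(
                BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
            var parser = new PdfReaderContentParser(reader);

            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                MatchCollector found = parser.ProcessContent(page, new MatchCollector(search));
                PdfContentByte canvas = stamper.GetOverContent(page);

                foreach (Match m in found.Matches)
                {
                    // Cover the original chunk (this hides it, it does not delete it).
                    canvas.SaveState();
                    canvas.SetColorFill(BaseColor.WHITE);
                    canvas.Rectangle(m.Box.Left, m.Box.Bottom, m.Box.Width, m.Box.Height);
                    canvas.Fill();
                    canvas.RestoreState();

                    // Draw the chunk again with the search text replaced.
                    canvas.BeginText();
                    canvas.SetFontAndSize(font, m.Box.Height); // approximate size
                    canvas.ShowTextAligned(PdfContentByte.ALIGN_LEFT,
                        m.Text.Replace(search, replacement), m.Box.Left, m.BaselineY, 0);
                    canvas.EndText();
                }
            }
        }
    }
}
```

A call such as `ReplaceSketch.Replace(path, @"C:\TEST.pdf", txt_SuchenNach.Text, txt_ErsetzenMit.Text)` would mirror the variables used in the question. Because the replacement text usually has a different width than the original, the covered area and the new text may not line up exactly, and text on a non-white background would show the rectangle.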
