Replace text in a PDF document using iTextSharp

Question

I want to replace a particular piece of text in a PDF document. I am currently using the iTextSharp library to work with PDF documents.

I extracted the bytes from the PDF document, replaced the matching bytes, and then wrote the document out again from those bytes, but it is not working. In the example below I am trying to replace the string 1234 with 5678.

Any advice on how to do this would be helpful.

PdfReader reader = new PdfReader(opf.FileNames[i]);
byte[] pdfbytes = reader.GetPageContent(1);

PdfString oldstring = new PdfString("1234");
PdfString newstring = new PdfString("5678");
byte[] byte1022 = oldstring.GetOriginalBytes();
byte[] byte1067 = newstring.GetOriginalBytes();

// Find the first occurrence of the four old bytes in the page content.
int position = -1;
for (int j = 0; j + 3 < pdfbytes.Length; j++)
{
    if (pdfbytes[j] == byte1022[0] && pdfbytes[j + 1] == byte1022[1] &&
        pdfbytes[j + 2] == byte1022[2] && pdfbytes[j + 3] == byte1022[3])
    {
        position = j;
        break;
    }
}

// Overwrite the match in place, but only if one was found.
if (position >= 0)
{
    pdfbytes[position] = byte1067[0];
    pdfbytes[position + 1] = byte1067[1];
    pdfbytes[position + 2] = byte1067[2];
    pdfbytes[position + 3] = byte1067[3];
}
File.WriteAllBytes(opf.FileNames[i].Replace(".pdf", "j.pdf"), pdfbytes);


Recommended answer

What makes you think 1234 is part of the page's content stream and not of a form XObject? Your code is never going to work in general if you don't parse all the resources of a page.
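To illustrate the point about form XObjects, here is a minimal sketch (not part of the original answer) that looks through the XObject resources of one page and reports any form XObject whose decoded stream contains a given ASCII string. The class and method names are hypothetical. It assumes iTextSharp 5.x, checks only the top-level XObjects (forms can reference further forms), and assumes the text is stored as literal ASCII bytes, which is often not the case with subset or custom-encoded fonts.

```csharp
using System;
using System.Text;
using iTextSharp.text.pdf;

static class XObjectScan // hypothetical helper, for illustration only
{
    // Prints the name of every form XObject on the given page whose decoded
    // stream contains the bytes of `needle` (assumed to be plain ASCII).
    public static void ReportFormXObjectsContaining(string path, int pageNumber, string needle)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            PdfDictionary page = reader.GetPageN(pageNumber);
            PdfDictionary resources = page.GetAsDict(PdfName.RESOURCES);
            PdfDictionary xobjects = resources == null ? null : resources.GetAsDict(PdfName.XOBJECT);
            if (xobjects == null) return; // this page references no XObjects

            foreach (PdfName name in xobjects.Keys)
            {
                PRStream stream = xobjects.GetAsStream(name) as PRStream;
                if (stream == null) continue;
                // Skip image XObjects; only forms carry their own content stream.
                if (!PdfName.FORM.Equals(stream.GetAsName(PdfName.SUBTYPE))) continue;

                // Latin-1 maps each byte to exactly one char, so a string search
                // over the decoded text is equivalent to a byte search.
                byte[] bytes = PdfReader.GetStreamBytes(stream);
                string content = Encoding.GetEncoding("ISO-8859-1").GetString(bytes);
                if (content.Contains(needle))
                    Console.WriteLine("Found in form XObject " + name);
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

If this reports a match while a search of GetPageContent() finds nothing, the text lives in the form XObject rather than in the page's own content stream.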

Also: I see GetPageContent(), but I don't see you using SetPageContent() anywhere. How are the changes ever going to be stored in the PdfReader object?

Moreover, I don't see you using PdfStamper to write the altered PdfReader contents to a file.
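The two missing steps can be sketched as follows. This is only an illustration under stated assumptions, not the answerer's code: the class and method names are hypothetical, iTextSharp 5.x is assumed, and the search only succeeds if 1234 appears as literal ASCII bytes in the page's own content stream. Text split across several show-text operators, kerned arrays, or custom font encodings will not match.

```csharp
using System;
using System.IO;
using System.Text;
using iTextSharp.text.pdf;

static class ContentStreamPatch // hypothetical helper, for illustration only
{
    // Returns a copy of `source` with the first occurrence of `oldBytes`
    // replaced by `newBytes`, or null if `oldBytes` does not occur.
    public static byte[] ReplaceFirst(byte[] source, byte[] oldBytes, byte[] newBytes)
    {
        for (int i = 0; i + oldBytes.Length <= source.Length; i++)
        {
            int k = 0;
            while (k < oldBytes.Length && source[i + k] == oldBytes[k]) k++;
            if (k < oldBytes.Length) continue; // mismatch, try the next offset

            int tail = source.Length - i - oldBytes.Length;
            byte[] result = new byte[i + newBytes.Length + tail];
            Array.Copy(source, 0, result, 0, i);
            Array.Copy(newBytes, 0, result, i, newBytes.Length);
            Array.Copy(source, i + oldBytes.Length, result, i + newBytes.Length, tail);
            return result;
        }
        return null;
    }

    // Patches page 1 and writes a complete PDF; returns false if nothing matched.
    public static bool PatchFirstPage(string inputPath, string outputPath)
    {
        PdfReader reader = new PdfReader(inputPath);
        try
        {
            byte[] patched = ReplaceFirst(reader.GetPageContent(1),
                Encoding.ASCII.GetBytes("1234"), Encoding.ASCII.GetBytes("5678"));
            if (patched == null) return false; // not present literally in the content stream

            reader.SetPageContent(1, patched); // store the change in the PdfReader
            PdfStamper stamper = new PdfStamper(reader, new FileStream(outputPath, FileMode.Create));
            stamper.Close(); // writes the whole document, not just the content stream
            return true;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Writing the raw bytes with File.WriteAllBytes, as in the question, produces a file that contains only a content stream and is not a valid PDF; going through PdfStamper is what produces a complete document.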

Finally: I'm too shy to quote the words of Leonard Rosenthol, Adobe's PDF Architect, but ask him, and he'll tell you personally that you shouldn't do what you're trying to do. PDF is NOT a format for editing. Read the intro of chapter 6 of the book I wrote on iText: http://www.manning.com/lowagie2/samplechapter6.pdf
