Replace the text in a PDF document using iTextSharp
Question
I want to replace a particular piece of text in a PDF document. I am currently using the iTextSharp library to work with PDF documents.
I extracted the bytes from the PDF document, replaced the matching bytes, and then wrote the document out again from those bytes, but it is not working. In the example below I am trying to replace the string 1234 with 5678.
Any advice on how to do this would be helpful.
PdfReader reader = new PdfReader(opf.FileNames[i]);
byte[] pdfbytes = reader.GetPageContent(1);
PdfString oldstring = new PdfString("1234");
PdfString newstring = new PdfString("5678");
byte[] byte1022 = oldstring.GetOriginalBytes();
byte[] byte1067 = newstring.GetOriginalBytes();
int position = 0;
for (int j = 0; j < pdfbytes.Length; j++)
{
    if (pdfbytes[j] == byte1022[0])
    {
        if (pdfbytes[j + 1] == byte1022[1])
        {
            if (pdfbytes[j + 2] == byte1022[2])
            {
                if (pdfbytes[j + 3] == byte1022[3])
                {
                    position = j;
                    break;
                }
            }
        }
    }
}
pdfbytes[position] = byte1067[0];
pdfbytes[position + 1] = byte1067[1];
pdfbytes[position + 2] = byte1067[2];
pdfbytes[position + 3] = byte1067[3];
File.WriteAllBytes(opf.FileNames[i].Replace(".pdf", "j.pdf"), pdfbytes);
Recommended answer
What makes you think 1234 is part of the page's content stream and not of a form XObject? Your code is never going to work in general if you don't parse all the resources of a page.
Also: I see GetPageContent(), but I don't see you using SetPageContent() anywhere. How are the changes ever going to be stored in the PdfReader object?
Moreover, I don't see you using PdfStamper to write the altered PdfReader contents to a file.
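To make the two points above concrete, here is a minimal sketch of the round trip: read the page content, patch it, store it back with SetPageContent(), and let PdfStamper write the file. It assumes iTextSharp 5.x. The class name PageContentRewriter and the helpers IndexOf and ReplaceOnPage are hypothetical names introduced only for illustration. It also assumes the text sits literally in the page's own content stream with a simple encoding and that the old and new byte sequences have the same length; it does not look inside form XObjects, so it is not a general solution.

```csharp
using System;
using System.IO;
using iTextSharp.text.pdf;

static class PageContentRewriter
{
    // Hypothetical helper: index of the first occurrence of needle in
    // haystack, or -1 if it does not occur. The loop bound keeps every
    // comparison inside the array.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        if (needle.Length == 0) return -1;
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
        {
            int k = 0;
            while (k < needle.Length && haystack[i + k] == needle[k]) k++;
            if (k == needle.Length) return i;
        }
        return -1;
    }

    // Hypothetical helper: patches the first match on one page and writes
    // the result to dest. Returns false (and writes nothing) if no match.
    public static bool ReplaceOnPage(string src, string dest, int pageNumber,
                                     byte[] oldBytes, byte[] newBytes)
    {
        if (oldBytes.Length != newBytes.Length)
            throw new ArgumentException("This sketch only overwrites in place.");

        PdfReader reader = new PdfReader(src);
        try
        {
            byte[] content = reader.GetPageContent(pageNumber);
            int position = IndexOf(content, oldBytes);
            if (position < 0) return false;

            // Overwrite the matched bytes in the decoded content stream.
            Buffer.BlockCopy(newBytes, 0, content, position, newBytes.Length);

            // Store the change back in the PdfReader object.
            reader.SetPageContent(pageNumber, content);

            // PdfStamper writes the altered reader contents as a complete PDF.
            using (FileStream output = new FileStream(dest, FileMode.Create))
            {
                PdfStamper stamper = new PdfStamper(reader, output);
                stamper.Close();
            }
            return true;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

With the question's variables, a call might look like PageContentRewriter.ReplaceOnPage(opf.FileNames[i], opf.FileNames[i].Replace(".pdf", "j.pdf"), 1, byte1022, byte1067). A false return means the bytes were not found in that page's content stream, which is exactly what happens when the text lives in a form XObject or is encoded differently from the search string.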
Finally: I'm too shy to quote the words of Leonard Rosenthol, Adobe's PDF Architect, but ask him, and he'll tell you personally that you shouldn't do what you're trying to do. PDF is NOT a format for editing. Read the intro of chapter 6 of the book I wrote on iText: http://www.manning.com/lowagie2/samplechapter6.pdf