Replace data in a PDF file

Problem description

I have to replace the strings that appear between << and >> in a PDF file, but I'm unable to do so.

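// Written against the PDFBox 1.8.x API (PDDocument.load(String), getAllPages(), COSVisitorException).
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;
import java.util.Scanner;

import org.apache.pdfbox.cos.COSArray;
import org.apache.pdfbox.cos.COSString;
import org.apache.pdfbox.exceptions.COSVisitorException;
import org.apache.pdfbox.pdfparser.PDFStreamParser;
import org.apache.pdfbox.pdfwriter.ContentStreamWriter;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDStream;
import org.apache.pdfbox.util.PDFOperator;

public void doIt( String inputFile, String outputFile) throws IOException, COSVisitorException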
{

    PDDocument doc = null;
    try
    {
        doc = PDDocument.load( inputFile );
        List pages = doc.getDocumentCatalog().getAllPages();
        for( int i=0; i<pages.size(); i++ )
        {
            PDPage page = (PDPage)pages.get( i );
            PDStream contents = page.getContents();
            PDFStreamParser parser = new PDFStreamParser(contents.getStream());
            parser.parse();
            List tokens = parser.getTokens();
            for( int j=0; j<tokens.size(); j++ )
            {
                Object next = tokens.get( j );
                if( next instanceof PDFOperator )
                {

                    PDFOperator op = (PDFOperator)next;
                    if( op.getOperation().equals( "Tj" ))

                    {
                        Scanner in = new Scanner(System.in);
                        COSString previous = (COSString)tokens.get( j-1 );
                        String string = previous.getString();
                        if(string.startsWith("<<") && string.endsWith(">>"))
                        {
                        System.out.println(string);
                        System.out.println("enter the word to be replaced");
                        String string2=in.nextLine();
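                        // The whole token is the placeholder, so assign the new text directly;
                        // replaceAll would interpret both arguments as regex syntax.
                        string = string2;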
                        previous.reset();
                        previous.append( string.getBytes() );
                        }
                    }     
                    else if( op.getOperation().equals( "TJ" ))
                    {
                        COSArray previous = (COSArray)tokens.get( j-1 );
                        for( int k=0; k<previous.size(); k++ )
                        {
                            Scanner in = new Scanner(System.in);
                            Object arrElement = previous.getObject( k );
                            if(arrElement instanceof COSString)
                            {
                                COSString cosString = (COSString)arrElement;
                                String string = cosString.getString();
                                if(string.startsWith("<<") && string.endsWith(">>"))
                                {
                                    System.out.println(string);
                                    System.out.println("enter the word to be replaced");
                                    String string2=in.nextLine();
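                                    // The whole token is the placeholder, so assign the new text directly;
                                    // replaceAll would interpret both arguments as regex syntax.
                                    string = string2;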
                                    cosString.reset();
                                    cosString.append( string.getBytes());
                                }
                            }
                        }
                    }
                }
            }
            PDStream updatedStream = new PDStream(doc);
            OutputStream out = updatedStream.createOutputStream();
            ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
            tokenWriter.writeTokens(tokens);
            page.setContents(updatedStream);
        }
        doc.save( outputFile );
        System.out.println("Done!! Now You can Open.");
    }
    finally
    {
        if( doc != null )
        {
            doc.close();
        }
    }
}
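
Independently of the PDF handling, the placeholder substitution itself can be isolated and tried out on plain strings. The following is a minimal sketch using only the Java standard library; the class name `PlaceholderFill`, the method `fill`, and the sample keys are hypothetical and not part of the question. It assumes a placeholder is any run of characters other than `<` and `>` wrapped in `<<` and `>>`, and it leaves unknown keys untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderFill {
    // Matches <<key>> where key contains no '<' or '>' characters.
    private static final Pattern PLACEHOLDER = Pattern.compile("<<([^<>]+)>>");

    /** Replaces every <<key>> in text with values.get(key); unknown keys are left as-is. */
    static String fill(String text, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement stops '$' and '\' in the value from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("name", "Alice");
        values.put("price", "$5");
        // prints: Dear Alice, total: $5 (<<unknown>>)
        System.out.println(fill("Dear <<name>>, total: <<price>> (<<unknown>>)", values));
    }
}
```

This only covers the string side of the problem. Whether a given placeholder is present as one contiguous string in a page's content stream is a separate question.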


Recommended answer

Please read the introduction to chapter 6 of my book. You're assuming that PDF is a format for editing text, but PDF wasn't designed for word processing.
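
One concrete consequence: the operand of a `TJ` operator is an array that interleaves string fragments with numeric positioning adjustments, so a PDF producer is free to split what looks like one word across several fragments. The sketch below models such an array with plain Java objects, using only the standard library, to show why a per-fragment `startsWith("<<") && endsWith(">>")` test can miss a placeholder that is visibly present on the page. The class `TjFragments`, its methods, and the sample array are made up for illustration, and the fragments are assumed to be already decoded text, which a real PDF does not guarantee because the byte-to-character mapping depends on the font.

```java
import java.util.Arrays;
import java.util.List;

final class TjFragments {
    /** Concatenates the string elements of a TJ-style array, skipping the numeric adjustments. */
    static String joinText(List<Object> tjArray) {
        StringBuilder sb = new StringBuilder();
        for (Object element : tjArray) {
            if (element instanceof String) {
                sb.append((String) element);
            }
        }
        return sb.toString();
    }

    /** True if any single string element is, on its own, a complete <<...>> placeholder. */
    static boolean anyFragmentIsPlaceholder(List<Object> tjArray) {
        for (Object element : tjArray) {
            if (element instanceof String) {
                String s = (String) element;
                if (s.startsWith("<<") && s.endsWith(">>")) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical operand of one TJ operator: strings interleaved with kerning numbers.
        List<Object> tjArray = Arrays.<Object>asList("<<na", -15, "me", 20, ">>");
        System.out.println(anyFragmentIsPlaceholder(tjArray)); // prints: false
        System.out.println(joinText(tjArray));                 // prints: <<name>>
    }
}
```

Even when the joined text is recovered this way, writing a replacement back is not straightforward: the new text generally has a different width, and the page's font may be subsetted so that it lacks the glyphs the replacement needs.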

Of course, maybe you're asking how to create a static form, as explained in section 6.3.5 of my book, but I doubt that the static nature of AcroForm technology will meet your needs. A pure XFA form (dynamic PDF) may solve your problem, but explaining XFA can't be done within the scope of an answer on SO: the XFA spec is several hundred pages long. As Duncan Jones indicated in the comments, you should first do some preliminary work.
