Get field names from an interactive PDF form


Question


Good morning,

I don't know how to read the field names from the PDF form below. I have tried all the AcroFields methods, but they all return 0 or null: http://www.finanse.mf.gov.pl/documents/766655/1481810/PIT-8C(7)_v1-0E.pdf

My code:

try {
    PdfReader.unethicalreading = true;
    PdfReader reader = new PdfReader(new FileInputStream("/root/TestPit8/web/notmod.pdf"));

    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("/root/TestPit8/web/testpdf.pdf"));
    AcroFields form = stamper.getAcroFields();

    form.setField("text_1", "666");
    form.setField("text_2", "666");
    form.setField("text_3", "666");
    form.setFieldProperty("text_3", "clrfflags", TextField.PASSWORD, null);
    form.setFieldProperty("text_3", "setflags", PdfAnnotation.FLAGS_PRINT, null);
    form.setField("text_3", "12345678", "xxxxxxxx");
    form.setFieldProperty("text_4", "textsize", new Float(12), null);
    form.regenerateField("text_4");
    stamper.close();
    reader.close();
} catch (Exception ex) {
    ex.printStackTrace();
}

Thanks for your help.

Answer

The form you share is a pure XFA form. XFA stands for the XML Forms Architecture.

Please read The Best iText Questions on StackOverflow and scroll to the section entitled "Interactive forms".

These are the first two questions of this section:

You are filling out the form as if it were based on AcroForm technology. That isn't supposed to work, is it? Your form is an XFA form!

Filling out an XFA form is explained in my book, in the XfaMovies example:

public void manipulatePdf(String src, String xml, String dest)
    throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader,
            new FileOutputStream(dest));
    AcroFields form = stamper.getAcroFields();
    XfaForm xfa = form.getXfa();
    xfa.fillXfaForm(new FileInputStream(xml));
    stamper.close();
    reader.close();
}

In this case, src is the path to the original form, xml is the path to the XML data, and dest is the path of the filled-out form.

If you want to read the data, you need the XfaMovie example:

This reads the full form (all the XFA):

public void readXfa(String src, String dest)
    throws IOException, ParserConfigurationException, SAXException,
        TransformerFactoryConfigurationError, TransformerException {
    FileOutputStream os = new FileOutputStream(dest);
    PdfReader reader = new PdfReader(src);
    XfaForm xfa = new XfaForm(reader);
    Document doc = xfa.getDomDocument();
    Transformer tf = TransformerFactory.newInstance().newTransformer();
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    tf.transform(new DOMSource(doc), new StreamResult(os));
    reader.close();
}

If you only want the data, you need to examine the datasets node:

public void readData(String src, String dest)
    throws IOException, ParserConfigurationException, SAXException,
        TransformerFactoryConfigurationError, TransformerException {
    FileOutputStream os = new FileOutputStream(dest);
    PdfReader reader = new PdfReader(src);
    XfaForm xfa = new XfaForm(reader);
    Node node = xfa.getDatasetsNode();
    NodeList list = node.getChildNodes();
    for (int i = 0; i < list.getLength(); i++) {
        if("data".equals(list.item(i).getLocalName())) {
            node = list.item(i);
            break;
        }
    }
    list = node.getChildNodes();
    for (int i = 0; i < list.getLength(); i++) {
        if("movies".equals(list.item(i).getLocalName())) {
            node = list.item(i);
            break;
        }
    }
    Transformer tf = TransformerFactory.newInstance().newTransformer();
    tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
    tf.setOutputProperty(OutputKeys.INDENT, "yes");
    tf.transform(new DOMSource(node), new StreamResult(os));
    reader.close();
}
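The two loops in readData do the same thing twice: they pick the first child with a given local name. A minimal sketch of that lookup as a reusable helper follows, using only the JDK's DOM classes. The class name XfaDataNodes and the methods childByLocalName, descend, and parse are hypothetical names introduced for illustration, and the sample XML in main is a made-up stand-in for a real datasets node. One assumption worth stating: getLocalName() only returns a usable value when the document was parsed with namespace awareness switched on, so the sketch enables it explicitly.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XfaDataNodes {

    /** Returns the first direct child whose local name matches, or null if there is none. */
    public static Node childByLocalName(Node parent, String localName) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // getLocalName() is null for text nodes, so the constant goes on the left
            if (localName.equals(child.getLocalName())) {
                return child;
            }
        }
        return null;
    }

    /** Walks down a chain of local names; returns null as soon as one step is missing. */
    public static Node descend(Node start, String... path) {
        Node current = start;
        for (String step : path) {
            if (current == null) {
                return null;
            }
            current = childByLocalName(current, step);
        }
        return current;
    }

    /** Parses an XML string with namespace awareness enabled (illustration only). */
    public static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample shaped like a datasets node; a real one would come from xfa.getDatasetsNode()
        String xml = "<xfa:datasets xmlns:xfa=\"http://www.xfa.org/schema/xfa-data/1.0/\">"
                + "<xfa:data><movies><movie>Example</movie></movies></xfa:data></xfa:datasets>";
        Node datasets = parse(xml).getDocumentElement();
        Node movies = descend(datasets, "data", "movies");
        System.out.println(movies == null ? "not found" : movies.getNodeName());
    }
}
```

For the PIT-8C form, the second step would presumably be "Deklaracja" rather than "movies", since that is the root element of its data.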

Note that I don't understand why you think there are fields such as text_1, text_2 in the form. XFA fields are easy to recognize because they contain plenty of [] characters.

Also: from the screenshot below (taken with iText RUPS), it is clear that there are no such fields in the form:

The tools are there on the iText web site. The documentation is there. Please use it!

Update:

So... instead of accepting my comprehensive answer, you decided to post a comment asking me to do your work in your place by asking "where I can find example code?", in spite of the fact that I provided links to XfaMovie and XfaMovies.

Well, here are two new examples for you:

Of course: I don't understand Polish, so I didn't always fill out the correct values, but now at least you no longer have a reason to ask "where I can find example code?"

Update 2:

In an extra comment, you claim that you can't find the NIP number (number 10 in the form) anywhere in the data structure.

This means either that you haven't examined data.xml, or that you don't understand XML.

Allow me to show the relevant part of the XML that contains the NIP number:

<Deklaracja xmlns="http://crd.gov.pl/wzor/2014/12/05/1880/" xmlns:etd="http://crd.gov.pl/xml/schematy/dziedzinowe/mf/2011/06/21/eD/DefinicjeTypy/">
    ....
    <Podmiot2 rola="Podatnik">
        <etd:OsobaFizyczna>
            <etd:NIP>0123456789</etd:NIP>
            <etd:ImiePierwsze>JUST TRY</etd:ImiePierwsze>
            <etd:Nazwisko>DUDE</etd:Nazwisko>
            <etd:DataUrodzenia>2015-02-19</etd:DataUrodzenia>
        </etd:OsobaFizyczna>
    </Podmiot2>
    ...
</Deklaracja>

In other words, the field name you're looking for is probably something like this: Deklaracja[0].Podmiot2[0].OsobaFizyczna[0].NIP[0] (whatever these words may mean, I only know one Polish word: Podpis).
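To make that guess less manual, the data XML can be walked to print one indexed, dot-separated path per leaf element. This is only a sketch under assumptions: the class name XfaNameGuesser and the method leafPaths are hypothetical, it uses plain JDK DOM parsing rather than iText, and the paths it produces mirror the shape of the data, not the form template, so the real field names in the form may differ and should be checked in a tool such as iText RUPS.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XfaNameGuesser {

    /** Lists an indexed, dot-separated path for every leaf element of the given XML. */
    public static List<String> leafPaths(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so getLocalName() drops prefixes such as "etd:"
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            List<String> paths = new ArrayList<>();
            collect(root, root.getLocalName() + "[0]", paths);
            return paths;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static void collect(Element element, String path, List<String> out) {
        Map<String, Integer> seen = new HashMap<>(); // per-name counter among siblings
        boolean hasChildElements = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and text nodes
            }
            hasChildElements = true;
            String name = child.getLocalName();
            int index = seen.merge(name, 1, Integer::sum) - 1; // 0 for the first sibling with this name
            collect((Element) child, path + "." + name + "[" + index + "]", out);
        }
        if (!hasChildElements) {
            out.add(path);
        }
    }

    public static void main(String[] args) {
        // The fragment quoted above, with the elided parts left out
        String xml = "<Deklaracja xmlns=\"http://crd.gov.pl/wzor/2014/12/05/1880/\""
                + " xmlns:etd=\"http://crd.gov.pl/xml/schematy/dziedzinowe/mf/2011/06/21/eD/DefinicjeTypy/\">"
                + "<Podmiot2 rola=\"Podatnik\"><etd:OsobaFizyczna>"
                + "<etd:NIP>0123456789</etd:NIP>"
                + "<etd:ImiePierwsze>JUST TRY</etd:ImiePierwsze>"
                + "<etd:Nazwisko>DUDE</etd:Nazwisko>"
                + "<etd:DataUrodzenia>2015-02-19</etd:DataUrodzenia>"
                + "</etd:OsobaFizyczna></Podmiot2></Deklaracja>";
        for (String path : leafPaths(xml)) {
            System.out.println(path);
        }
    }
}
```

Searching the printed list for a path ending in NIP[0] is then a quick way to locate the NIP entry in a full data.xml.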
