Setting image form field


Problem description

I created a sample PDF form with one image field. I'm trying to set an image to the field using PDFBox.

I see that PDFBox treats such field as an instance of PDPushButton but I don't see this class' interface exposes methods to deal with images...

The sample PDF can be downloaded using the URL in comment.

How can it be done?


EDIT:

Here is what I'm doing so far:

PDDocument pdfDocument = null; // placeholder: the form PDF has to be loaded here
PDAcroForm acroForm = pdfDocument.getDocumentCatalog().getAcroForm();
if (acroForm != null) {
    PDPushButton field = (PDPushButton) acroForm.getField("test");

    PDImageXObject pdImageXObject = PDImageXObject.createFromFile("my_img.png", pdfDocument);

    List<PDAnnotationWidget> widgets = field.getWidgets();

    /*
     * The field may appear multiple times in the document; I would like to repeat this for every widget (occurrence).
     */
    for(PDAnnotationWidget widget : widgets) {
        PDRectangle rectangle = widget.getRectangle();

        //PDAppearanceDictionary appearanceDict = widget.getAppearance();
        /*
         * In my case, when the image is not set with Acrobat DC, appearanceDict is null.
         */

        /*
         * Create the appearance stream and fill it with the image.
         */
        PDAppearanceStream pdAppearanceStream = new PDAppearanceStream(pdfDocument);
        pdAppearanceStream.setResources(new PDResources());
        try (PDPageContentStream pdPageContentStream = new PDPageContentStream(pdfDocument, pdAppearanceStream)) {
            pdPageContentStream.drawImage(pdImageXObject, rectangle.getLowerLeftX(), rectangle.getLowerLeftY(), pdImageXObject.getWidth(), pdImageXObject.getHeight());
        }
        pdAppearanceStream.setBBox(new PDRectangle(rectangle.getWidth(), rectangle.getHeight()));

        /*
         * Create the appearance dict with only one appearance (default) and set the appearance to the widget.
         */
        PDAppearanceDictionary appearanceDict = new PDAppearanceDictionary();
        appearanceDict.setNormalAppearance(pdAppearanceStream);
        widget.setAppearance(appearanceDict);
    }
}

ByteArrayOutputStream outStr = new ByteArrayOutputStream();
pdfDocument.save(outStr);
pdfDocument.close();

However, the generated PDF doesn't show any image in Acrobat Reader.

My goal is to start with this PDF (the form with no image set) and use PDFBox to get this PDF (the form with the image set).

Solution

To start with, the PDF standard ISO 32000-2 does not specify "image fields" at all. Certain proprietary PDF generators and editors (Adobe products in particular) use JavaScript to make push button fields behave like image form fields in a GUI, in particular in their own PDF viewers. Nonetheless, those push buttons are push buttons, not image form fields. Thus, as you observed:

"PDFBox treats such field as an instance of PDPushButton but I don't see this class' interface exposes methods to deal with images..."

If you really need something like an image field, though, there is no need to look for a different way to emulate one; you can simply follow Adobe's lead. You merely have to be aware that you are only emulating image fields.

To fill the push button field with an image, one can use the AcroFormPopulator Renat Gatin presents in his answer to his own question "How to insert image programmatically in to AcroForm field using java PDFBox?".

Beware, though: applied to your sample file, it reveals a bug in the PDFBox form flattening code. Thus, you should deactivate form flattening in the AcroFormPopulator, i.e. remove the acroForm.flatten() call in it.

The bug in question is due to a missing transformation: if an XObject is used as the appearance of a form field, the specification requires everything in its bounding box to be mapped automatically into the annotation rectangle:

1. The appearance’s bounding box (specified by its BBox entry) shall be transformed, using Matrix, to produce a quadrilateral with arbitrary orientation. The transformed appearance box is the smallest upright rectangle that encompasses this quadrilateral.

2. A matrix A shall be computed that scales and translates the transformed appearance box to align with the edges of the annotation’s rectangle (specified by the Rect entry). A maps the lower-left corner (the corner with the smallest x and y coordinates) and the upper-right corner (the corner with the greatest x and y coordinates) of the transformed appearance box to the corresponding corners of the annotation’s rectangle.

3. Matrix shall be concatenated with A to form a matrix AA that maps from the appearance’s coordinate system to the annotation’s rectangle in default user space

(ISO 32000-2, section 12.5.5 Appearance streams, Algorithm: appearance streams)
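The three quoted steps can be sketched with plain java.awt.geom classes, without PDFBox. This is only an illustration under stated assumptions: the class name AppearanceMatrixSketch, the method computeAA, and the coordinates used in main are hypothetical, the BBox and Rect are modelled as Rectangle2D values, and the transformed appearance box is assumed to have non-zero width and height.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of PDFBox: mirrors the three steps quoted from ISO 32000-2.
final class AppearanceMatrixSketch {

    static AffineTransform computeAA(Rectangle2D bbox, AffineTransform matrix, Rectangle2D rect) {
        // Step 1: transform BBox by Matrix and take the smallest upright rectangle around the result.
        Rectangle2D transformed = matrix.createTransformedShape(bbox).getBounds2D();

        // Step 2: A scales and translates the transformed appearance box onto Rect.
        // AffineTransform applies the most recently added operation first, so a point is
        // moved to the origin, then scaled, then moved to the lower-left corner of Rect.
        AffineTransform a = new AffineTransform();
        a.translate(rect.getMinX(), rect.getMinY());
        a.scale(rect.getWidth() / transformed.getWidth(),
                rect.getHeight() / transformed.getHeight());
        a.translate(-transformed.getMinX(), -transformed.getMinY());

        // Step 3: AA applies Matrix first and A afterwards.
        AffineTransform aa = new AffineTransform(a);
        aa.concatenate(matrix);
        return aa;
    }

    public static void main(String[] args) {
        // Assumed example values: a BBox away from the origin and a Rect of twice its size.
        Rectangle2D bbox = new Rectangle2D.Double(300, 500, 100, 50);
        Rectangle2D rect = new Rectangle2D.Double(40, 60, 200, 100);
        AffineTransform aa = computeAA(bbox, new AffineTransform(), rect);

        Point2D lowerLeft = aa.transform(new Point2D.Double(300, 500), null);
        Point2D upperRight = aa.transform(new Point2D.Double(400, 550), null);
        System.out.println("BBox lower-left maps to " + lowerLeft);
        System.out.println("BBox upper-right maps to " + upperRight);
    }
}
```

A flattener that wants to keep the visual result would have to emit such a matrix in front of the XObject invocation in the page content, since the viewer no longer derives it from BBox, Matrix, and Rect.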

Once the XObject is referenced from the page content after flattening, the viewer no longer determines and applies this transformation AA automatically, so the form flattener has to add it explicitly.

Apparently PDFBox form flattening does not create that AA matrix; in particular, it assumes that the lower left corner of the bounding box is always at the origin of the coordinate system.

For your example PDF this is not the case, so flattening here effectively moves the flattened image field button off-screen.

PS: The situation is even weirder than expected. PDAcroForm.flatten(List<PDField>, boolean) does try to determine whether translation or scaling of a former appearance XObject is necessary, and it adds a transformation if it considers one required, but

1. when checking the need for translation in PDAcroForm.resolveNeedsTranslation(PDAppearanceStream), it actually checks the form XObject resources of the appearance XObject; if and only if there is an XObject among them with a bounding box with neither anchor coordinate being 0, it is assumed that no translation is required. This is a very weird test; a proper test would have to check the bounding box of the appearance XObject itself, not of the form XObjects it contains. In the sample document the appearance XObject does not contain any form XObjects, so translation is automatically assumed to be required.

2. when adding a translation transformation, it again ignores the bounding box of the appearance XObject and translates as if the appearance XObject anchor were at the coordinate system origin. In the sample document this is completely inadequate, as the bounding box anchor is already located that far out, causing it to be relocated to twice the required distance from the origin.

3. when checking the need for scaling in PDAcroForm.resolveNeedsScaling(PDAppearanceStream), it actually checks for the presence of contained arbitrary XObjects and assumes that scaling is required if there is such a contained XObject. In the sample document there is an image XObject, so scaling is assumed to be necessary... weird.

These 3 details make no sense at all. (Well, there may have been some sample documents for which they by chance give rise to the desired result, but in general this is nonsense.)
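The "twice the required distance" effect from item 2 can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not PDFBox code: the class FlattenTranslationSketch, its methods, and the coordinates are hypothetical, and it assumes an identity Matrix and no scaling, so that only the translation part matters.

```java
// Hypothetical model of the translation step; it does not call or reproduce PDFBox itself.
final class FlattenTranslationSketch {

    // Translation that moves the BBox lower-left corner onto the Rect lower-left corner.
    static double[] requiredTranslation(double[] bboxLowerLeft, double[] rectLowerLeft) {
        return new double[] {
            rectLowerLeft[0] - bboxLowerLeft[0],
            rectLowerLeft[1] - bboxLowerLeft[1]
        };
    }

    // Translation as described in item 2: the BBox is ignored, i.e. treated as anchored at the origin.
    static double[] originAssumingTranslation(double[] rectLowerLeft) {
        return new double[] { rectLowerLeft[0], rectLowerLeft[1] };
    }

    // Applies a translation to a point.
    static double[] apply(double[] point, double[] translation) {
        return new double[] { point[0] + translation[0], point[1] + translation[1] };
    }

    public static void main(String[] args) {
        // Assumed values: the BBox anchor already coincides with the Rect anchor.
        double[] bboxLowerLeft = { 300, 500 };
        double[] rectLowerLeft = { 300, 500 };

        double[] correct = apply(bboxLowerLeft, requiredTranslation(bboxLowerLeft, rectLowerLeft));
        double[] shifted = apply(bboxLowerLeft, originAssumingTranslation(rectLowerLeft));

        System.out.println("with BBox considered: " + correct[0] + ", " + correct[1]);
        System.out.println("with BBox ignored:    " + shifted[0] + ", " + shifted[1]);
    }
}
```

With a BBox anchored at the origin both variants agree, which may be why the problem goes unnoticed for many documents.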
