PDF bleed detection

Question

I'm currently writing a little tool (Python + pyPdf) to test PDFs for printer conformity.

Alas, I already get confused at the first task: detecting whether the PDF has at least 3 mm of 'bleed' (a border around the pages where nothing is printed). I already gathered that I can't detect the bleed for the complete document, since there doesn't seem to be a global one. On the pages, however, I can detect a total of five different boxes:

  • mediaBox
  • bleedBox
  • trimBox
  • cropBox
  • artBox

I read the pyPdf documentation concerning those boxes, but the only one I understood is the mediaBox, which seems to represent the overall page size (i.e. the paper).

The bleedBox pretty obviously ought to define the bleed, but that doesn't always seem to be the case.

Another thing I noted was that, for instance with the PDF, all those boxes have the exact same size on each page (implying no bleed at all), but when I open it there's a huge amount of bleed. This leads me to think that the individual text elements have their own offset.

So, obviously, just calculating the bleed from mediaBox and bleedBox is not a viable option.

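I'd be really glad if someone could shed some light on what those boxes actually mean and what conclusions I can draw from them (e.g. that one box is always smaller than another one).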

Bonus question: Can someone tell me what exactly the "default user space unit" mentioned in the documentation is? I'm pretty sure this refers to mm on my machine, but I'd like to enforce mm everywhere.

Recommended answer

Quoting from the PDF specification ISO 32000-1:2008 as published by Adobe:

14.11.2 Page Boundaries

14.11.2.1 General

A PDF page may be prepared either for a finished medium, such as a sheet of paper, or as part of a prepress process in which the content of the page is placed on an intermediate medium, such as film or an imposed reproduction plate. In the latter case, it is important to distinguish between the intermediate page and the finished page. The intermediate page may often include additional production-related content, such as bleeds or printer marks, that falls outside the boundaries of the finished page. To handle such cases, a PDF page may define as many as five separate boundaries to control various aspects of the imaging process:

  • The media box defines the boundaries of the physical medium on which the page is to be printed. It may include any extended area surrounding the finished page for bleed, printing marks, or other such purposes. It may also include areas close to the edges of the medium that cannot be marked because of physical limitations of the output device. Content falling outside this boundary may safely be discarded without affecting the meaning of the PDF file.

  • The crop box defines the region to which the contents of the page shall be clipped (cropped) when displayed or printed. Unlike the other boxes, the crop box has no defined meaning in terms of physical page geometry or intended use; it merely imposes clipping on the page contents. However, in the absence of additional information (such as imposition instructions specified in a JDF or PJTF job ticket), the crop box determines how the page’s contents shall be positioned on the output medium. The default value is the page’s media box.

  • The bleed box (PDF 1.3) defines the region to which the contents of the page shall be clipped when output in a production environment. This may include any extra bleed area needed to accommodate the physical limitations of cutting, folding, and trimming equipment. The actual printed page may include printing marks that fall outside the bleed box. The default value is the page’s crop box.

  • The trim box (PDF 1.3) defines the intended dimensions of the finished page after trimming. It may be smaller than the media box to allow for production-related content, such as printing instructions, cut marks, or colour bars. The default value is the page’s crop box.

  • The art box (PDF 1.3) defines the extent of the page’s meaningful content (including potential white space) as intended by the page’s creator. The default value is the page’s crop box.

The page object dictionary specifies these boundaries in the MediaBox, CropBox, BleedBox, TrimBox, and ArtBox entries, respectively (see Table 30). All of them are rectangles expressed in default user space units. The crop, bleed, trim, and art boxes shall not ordinarily extend beyond the boundaries of the media box. If they do, they are effectively reduced to their intersection with the media box. Figure 86 illustrates the relationships among these boundaries. (The crop box is not shown in the figure because it has no defined relationship with any of the other boundaries.)
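The defaulting and clipping rules in the quoted passage can be sketched in plain Python. This is a minimal illustration, not part of pyPdf: the names `Box`, `normalize`, `intersect`, and `effective_boxes` are hypothetical, and boxes are assumed to be `(x0, y0, x1, y1)` tuples in default user space units that have already been read from the page dictionary. It also assumes every box overlaps the media box at least partially.

```python
from typing import Dict, Optional, Tuple

# Hypothetical representation: (x0, y0, x1, y1) in default user space units.
Box = Tuple[float, float, float, float]


def normalize(box: Box) -> Box:
    """Order the coordinates so the first pair is the lower-left corner."""
    x0, y0, x1, y1 = box
    return (min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def intersect(box: Box, media: Box) -> Box:
    """Reduce a box to its intersection with the media box."""
    bx0, by0, bx1, by1 = normalize(box)
    mx0, my0, mx1, my1 = normalize(media)
    return (max(bx0, mx0), max(by0, my0), min(bx1, mx1), min(by1, my1))


def effective_boxes(
    media: Box,
    crop: Optional[Box] = None,
    bleed: Optional[Box] = None,
    trim: Optional[Box] = None,
    art: Optional[Box] = None,
) -> Dict[str, Box]:
    """Apply the defaults quoted above: crop falls back to media;
    bleed, trim, and art fall back to crop; all are clipped to media."""
    media = normalize(media)
    crop_eff = intersect(crop, media) if crop is not None else media

    def resolve(box: Optional[Box]) -> Box:
        return intersect(box, media) if box is not None else crop_eff

    return {
        "media": media,
        "crop": crop_eff,
        "bleed": resolve(bleed),
        "trim": resolve(trim),
        "art": resolve(art),
    }
```

For a page that sets only a media box, all five resolved boxes come out identical, so identical boxes by themselves say nothing about where the content actually sits.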

Following that, there is a nice graphic showing those boxes in relation to each other (Figure 86 of the specification; not reproduced here).

The reasons why in many cases only the media box is set are

  1. that in the case of PDFs meant for electronic consumption (i.e. reading on a computer) the other boxes hardly matter; and

  2. that even in the prepress context they aren't as necessary anymore as they used to be, cf. the article Pedro refers to in his comment.

Concerning your "bonus question": The user space unit is 1⁄72 inch by default; since PDF 1.6 it can be changed, though, to any (not necessarily integer) multiple of that size using the UserUnit entry in the page dictionary. Changing it in an existing PDF essentially scales the page, as the user space unit is the basic unit in the device-independent coordinate system of a page. Therefore, unless you want to update each and every command in the page descriptions referring to coordinates to keep the page dimensions, you won't want to enforce a millimeter user space unit... ;)
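To report measurements in millimetres without touching the PDF, one option is to convert at read time instead of changing the user space unit. The sketch below assumes the conventional prepress reading of the quoted definitions, namely that the declared bleed is the distance between the trim box and the bleed box on each side. The function names and the 3 mm default are illustrative, and `user_unit` stands for the page's UserUnit value (1.0 when the entry is absent). It only checks what the boxes declare, not whether page content actually extends into the bleed area.

```python
MM_PER_INCH = 25.4
UNITS_PER_INCH = 72.0  # the default user space unit is 1/72 inch


def units_to_mm(value, user_unit=1.0):
    """Convert a length in user space units to millimetres."""
    return value * user_unit * MM_PER_INCH / UNITS_PER_INCH


def bleed_margins_mm(trim, bleed, user_unit=1.0):
    """Distance from each trim edge out to the bleed edge,
    returned as (left, bottom, right, top) in millimetres.
    Both boxes are (x0, y0, x1, y1) tuples with x0 <= x1 and y0 <= y1."""
    tx0, ty0, tx1, ty1 = trim
    bx0, by0, bx1, by1 = bleed
    distances = (tx0 - bx0, ty0 - by0, bx1 - tx1, by1 - ty1)
    return tuple(units_to_mm(d, user_unit) for d in distances)


def has_min_bleed(trim, bleed, min_mm=3.0, user_unit=1.0):
    """True if the declared bleed is at least min_mm on all four sides."""
    tolerance = 1e-9  # absorb floating-point rounding in the conversion
    margins = bleed_margins_mm(trim, bleed, user_unit)
    return all(m >= min_mm - tolerance for m in margins)
```

When the trim box and the bleed box are identical, as in the PDF described in the question, every margin is zero and `has_min_bleed` returns `False`, even though the rendered page may still show wide empty borders.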
