TextPosition Bounding Box PDFBox
Problem description
I am trying to draw the glyph bounding box that corresponds to a TextPosition, as shown in the PDF 32000 documentation.
Here is my function, which performs the computation from glyph space to user space:
@Override
protected void processTextPosition(TextPosition text) {
    PDFont font = text.getFont();
    BoundingBox bbox = font.getBoundingBox();
    Rectangle2D.Float rect = new Rectangle2D.Float(bbox.getLowerLeftX(), bbox.getUpperRightY(),
            bbox.getWidth(), bbox.getHeight());
    // Start from the text matrix, then map glyph space to text space
    AffineTransform at = text.getTextMatrix().createAffineTransform();
    if (font instanceof PDType3Font) {
        at.concatenate(font.getFontMatrix().createAffineTransform());
    } else {
        at.scale(1 / 1000f, 1 / 1000f);
    }
    Shape shape = at.createTransformedShape(rect);
    rectangles.add(fillBBox(text));
    super.processTextPosition(text);
}
Here is the function that draws the extracted rectangles:
private void drawBoundingBoxes() throws IOException {
    String fileNameOut = path.substring(0, path.lastIndexOf(".")) + "_OUT.pdf";
    log.info("Drawing Bounding Boxes for TextPositions");
    PDPageContentStream contentStream = new PDPageContentStream(document,
            document.getPage(document.getNumberOfPages() - 1),
            PDPageContentStream.AppendMode.APPEND, false, true);
    contentStream.setLineWidth(1f);
    contentStream.setStrokingColor(Color.RED);
    try {
        for (Shape p : rectangles) {
            // Debug override: always draw the first shape (`all` is not shown in this snippet)
            p = all.get(0);
            double[] coords = new double[6];
            GeneralPath g = new GeneralPath(p.getBounds2D());
            for (PathIterator pi = g.getPathIterator(null); !pi.isDone(); pi.next()) {
                // Read the segment first so that coords holds its points before logging
                int segment = pi.currentSegment(coords);
                System.out.println(Arrays.toString(coords));
                switch (segment) {
                    case PathIterator.SEG_MOVETO:
                        System.out.println("move to");
                        contentStream.moveTo((float) coords[0], (float) coords[1]);
                        break;
                    case PathIterator.SEG_LINETO:
                        System.out.println("line to");
                        contentStream.lineTo((float) coords[0], (float) coords[1]);
                        break;
                    case PathIterator.SEG_CUBICTO:
                        System.out.println("cubic to");
                        contentStream.curveTo((float) coords[0], (float) coords[1],
                                (float) coords[2], (float) coords[3],
                                (float) coords[4], (float) coords[5]);
                        break;
                    case PathIterator.SEG_CLOSE:
                        System.out.println("close");
                        contentStream.closeAndStroke();
                        break;
                    default:
                        System.out.println("no shatt");
                        break;
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        contentStream.close();
        document.save(new File(fileNameOut));
    }
}
Then, when I draw on the PDF, I get the following result for the first letter (the capital V).
I can't figure out what I am doing wrong. Any ideas?
Recommended answer
Mr.D,
I tested your code, and the only change I needed to make it work was to invert the Y axis. This is necessary because the origin of the PDF user space is at the bottom-left corner, whereas the origin of the Java 2D user space is at the top-left corner [1].
8.3.2.3 User Space
The user space coordinate system shall be initialized to a default state for each page of a document. The CropBox entry in the page dictionary shall specify the rectangle of user space corresponding to the visible area of the intended output medium (display window or printed page). The positive x axis extends horizontally to the right and the positive y axis vertically upward, as in standard mathematical practice (subject to alteration by the Rotate entry in the page dictionary). The length of a unit along both the x and y axes is set by the UserUnit entry (PDF 1.6) in the page dictionary (see Table 30). If that entry is not present or supported, the default value of 1⁄72 inch is used. This coordinate system is called default user space.[2]
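To make the difference between the two conventions concrete, here is a minimal sketch that uses only java.awt.geom. The class name YAxisFlip, the method flipY, and the page height of 792 units (US Letter at 1/72 inch per unit) are assumptions chosen for illustration; none of them come from PDFBox or from the code above.

```java
import java.awt.geom.Rectangle2D;

class YAxisFlip {
    /**
     * Mirrors a rectangle vertically within a page of the given height.
     * The same formula converts bottom-left-origin coordinates to
     * top-left-origin coordinates and back again.
     */
    public static Rectangle2D flipY(Rectangle2D r, double pageHeight) {
        double flippedY = pageHeight - r.getY() - r.getHeight();
        return new Rectangle2D.Double(r.getX(), flippedY, r.getWidth(), r.getHeight());
    }

    public static void main(String[] args) {
        // A box whose lower edge sits 700 units above the bottom of the page (PDF convention)
        Rectangle2D pdfRect = new Rectangle2D.Double(72, 700, 100, 20);
        // In a top-left-origin system its upper edge is 792 - 700 - 20 = 72 units below the top
        Rectangle2D javaRect = flipY(pdfRect, 792);
        System.out.println(javaRect);
        // Applying the flip twice returns the original rectangle
        System.out.println(flipY(javaRect, 792));
    }
}
```

Note that a Rectangle2D always stores its minimum-y corner, so which edge that corner represents visually (top or bottom) depends on the coordinate system the rectangle is interpreted in.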
Source code
@Override
protected void processTextPosition(TextPosition text) {
    try {
        PDFont font = text.getFont();
        BoundingBox bbox = font.getBoundingBox();
        Rectangle2D.Float rect = new Rectangle2D.Float(bbox.getLowerLeftX(), bbox.getUpperRightY(),
                bbox.getWidth(), bbox.getHeight());
        // Start from the text matrix, then map glyph space to text space
        AffineTransform at = text.getTextMatrix().createAffineTransform();
        if (font instanceof PDType3Font) {
            at.concatenate(font.getFontMatrix().createAffineTransform());
        } else {
            at.scale(1 / 1000f, 1 / 1000f);
        }
        Shape shape = at.createTransformedShape(rect);
        // Invert Y axis: shift the box down by its own height
        Rectangle2D bounds = shape.getBounds2D();
        bounds.setRect(bounds.getX(), bounds.getY() - bounds.getHeight(), bounds.getWidth(), bounds.getHeight());
        rectangles.add(bounds);
        super.processTextPosition(text);
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
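As a cross-check of the arithmetic in the fix, the sketch below reproduces the transform with plain java.awt.geom and compares two ways of building the box. It is not PDFBox code: the class GlyphBoxSketch, its method names, the font box values (-100, -200, 900, 800) and the text matrix (12 pt font placed at 100, 500) are all assumptions made up for illustration. Under those assumptions, anchoring the rectangle at the lower-left corner of the font box appears to give the same result as anchoring it at upperRightY and then subtracting the height.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

class GlyphBoxSketch {
    /** Text matrix followed by the 1/1000 glyph-space scaling used for non-Type 3 fonts. */
    private static AffineTransform glyphToUser(AffineTransform textMatrix) {
        AffineTransform at = new AffineTransform(textMatrix);
        at.scale(1 / 1000.0, 1 / 1000.0);
        return at;
    }

    /** Variant A: anchor the rectangle at the lower-left corner of the font box. */
    public static Rectangle2D fromLowerLeft(float llx, float lly, float urx, float ury,
                                            AffineTransform textMatrix) {
        Rectangle2D.Float glyphBox = new Rectangle2D.Float(llx, lly, urx - llx, ury - lly);
        return glyphToUser(textMatrix).createTransformedShape(glyphBox).getBounds2D();
    }

    /** Variant B: anchor at upperRightY, then shift down by the height, as in the fix above. */
    public static Rectangle2D fromUpperRightThenShift(float llx, float lly, float urx, float ury,
                                                      AffineTransform textMatrix) {
        Rectangle2D.Float glyphBox = new Rectangle2D.Float(llx, ury, urx - llx, ury - lly);
        Rectangle2D bounds = glyphToUser(textMatrix).createTransformedShape(glyphBox).getBounds2D();
        bounds.setRect(bounds.getX(), bounds.getY() - bounds.getHeight(),
                bounds.getWidth(), bounds.getHeight());
        return bounds;
    }

    public static void main(String[] args) {
        // Hypothetical text matrix: uniform scale 12, translation (100, 500), no rotation
        AffineTransform textMatrix = new AffineTransform(12, 0, 0, 12, 100, 500);
        System.out.println(fromLowerLeft(-100, -200, 900, 800, textMatrix));
        System.out.println(fromUpperRightThenShift(-100, -200, 900, 800, textMatrix));
    }
}
```

The equivalence is only checked here for an upright text matrix with positive scaling. For rotated or mirrored text, the axis-aligned bounds no longer describe the glyph box, and the transformed shape itself would probably be the better thing to draw.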
References
Document management - Portable document format - Part 1: PDF 1.7, PDF 32000-1:2008, Section 8.3: Coordinate Systems, page 115