Using PDFBox to write UTF-8 encoded strings to a PDF

Question

I am having trouble writing Unicode characters out to a PDF using PDFBox. Here is some sample code that generates garbage characters instead of outputting "š". What can I add to get support for UTF-8 strings?

PDDocument document = new PDDocument();
PDPage page = new PDPage();
document.addPage(page);
PDPageContentStream contentStream = new PDPageContentStream(document, page);

PDType1Font font = PDType1Font.HELVETICA;
contentStream.setFont(font, 12);
contentStream.beginText();
contentStream.moveTextPositionByAmount(100, 400);
contentStream.drawString("š");
contentStream.endText();
contentStream.close();
document.save("test.pdf");
document.close();

Recommended answer

You are using one of the inbuilt 'Base 14' fonts that are supplied with Adobe Reader. These fonts are not Unicode; they are effectively a standard Latin alphabet, though with a couple of extra characters. It looks like the character you mention, a lowercase s with a caron (š), is not available in PDF Latin text... though an uppercase Š is available but curiously on Windows only. See Appendix D of the PDF specification at http://www.adobe.com/devnet/pdf/pdf_reference.html for details.
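
To make the "not Unicode" point concrete, here is a rough sketch using only the Java standard library. It checks whether a string fits in a single-byte Latin encoding and prints its UTF-8 bytes. ISO-8859-1 is only a stand-in chosen for illustration; it is not claimed to be the exact encoding PDFBox applies to the Base 14 fonts. The class and method names (EncodingCheck, fitsInLatin1, utf8Hex) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // True if every character of text can be encoded in ISO-8859-1.
    // Assumption: ISO-8859-1 stands in for a single-byte, non-Unicode font encoding.
    static boolean fitsInLatin1(String text) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(text);
    }

    // Returns the UTF-8 bytes of text as space-separated uppercase hex.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String text = "\u0161"; // š, U+0161 LATIN SMALL LETTER S WITH CARON

        System.out.println(fitsInLatin1("s"));  // plain ASCII letter
        System.out.println(fitsInLatin1(text)); // š has no ISO-8859-1 code
        System.out.println(utf8Hex(text));      // two bytes in UTF-8
    }
}
```

A character that fails a check like this cannot be written as one byte in that encoding, so however the string is stored in Java, the font needs its own way to map the character to a glyph. That is what an embedded Unicode font provides.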

Anyway, getting to the point... you need to embed a Unicode font if you want to use Unicode characters. Make sure you are licensed to embed whatever font you decide on... I can recommend the open-source Gentium or Doulos fonts because they're free, high quality and have comprehensive Unicode support.
