Using PDFBox to write Unicode strings to a PDF


Question

I want to use Apache PDFBox 1.8.8 to create a PDF that contains Unicode characters, but I'm confused about what is supported and what isn't.

An answer posted here suggests that this is a bug which has been fixed on the trunk.

Another answer posted here suggests that I have to do the translation myself.

And another (older) answer posted here talks about embedding fonts.

Can someone please clarify? Also, if it was a bug that is now fixed, can someone tell me when the next release of PDFBox is likely to be?

Thanks.

Recommended answer

Essentially all the answers you linked to are correct. You have to keep in mind which PDFBox version each of them refers to.

Concerning this answer

In the pre-2.0.0 versions (up to the current 1.8.8), the text drawing operations were very limited. They did not even support the full WinAnsi encoding, which is the encoding used by the font objects these versions generate.
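To make the "do the translation myself" point more concrete, here is a minimal sketch of a pre-flight check one might run before handing text to a 1.8.x text drawing call. It uses only the JDK, not PDFBox. It assumes that the JDK's `windows-1252` charset is a reasonable stand-in for PDF's WinAnsiEncoding (the two are closely related but not guaranteed to match in every code), and the class and method names are hypothetical. Since the 1.8.x versions reportedly did not handle even all of WinAnsi, passing this check is necessary rather than sufficient.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

final class WinAnsiCheck {

    // Assumption: windows-1252 approximates PDF WinAnsiEncoding closely enough for a pre-check.
    private static final Charset CP1252 = Charset.forName("windows-1252");

    private WinAnsiCheck() {
    }

    /** Returns true if every character of the text can be encoded in windows-1252. */
    static boolean isEncodable(String text) {
        CharsetEncoder encoder = CP1252.newEncoder();
        return encoder.canEncode(text);
    }

    /**
     * Replaces each UTF-16 char that windows-1252 cannot encode with the substitute.
     * A supplementary code point (a surrogate pair) therefore yields two substitutes.
     */
    static String replaceUnencodable(String text, char substitute) {
        CharsetEncoder encoder = CP1252.newEncoder();
        StringBuilder result = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            result.append(encoder.canEncode(c) ? c : substitute);
        }
        return result.toString();
    }
}
```

For example, `WinAnsiCheck.isEncodable("caf\u00e9")` is true, while a string containing CJK characters fails the check and would need a different font and encoding rather than a substitution.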

Concerning this answer

The current 2.0.0-SNAPSHOT development state is much improved. The limitations of the text drawing operations have been removed: they now encode the text properly, and the fonts used are properly encoded and embedded. Bugs in the early implementations of these improvements have meanwhile mostly been fixed.

Concerning this answer

This answer points to something one needs to keep in mind no matter which PDFBox version one uses: a specific font does not necessarily support the whole Unicode range of code points. If the font you use does not contain a glyph definition for a character, you can encode as much as you want; your character won't be drawn properly. This especially concerns the standard 14 fonts which every PDF viewer has to support: they need only support characters from a few Latin-style encodings, far from the full Unicode set.
