What are the extra characters in the font name of my PDF?

Views: 196 | Published: 2016/9/6 14:46:06 | Tags: c# visual-studio-2010 itextsharp


Problem description

While extracting font names from a PDF, I get some junk characters followed by a plus sign, and then the font name with the font style. I want to remove the junk characters. I only get them for a few PDFs. Example: MMLPEO+RemingtonNoiseless
string curFont = renderInfo.GetFont().PostscriptFontName;
Solution
The "junk" characters indicate that the font isn't embedded completely: only a subset of it is stored in the PDF. You'll find names such as ABCDEF+RemingtonNoiseless, UVWXYZ+RemingtonNoiseless, and so on, meaning that there may be different subsets of the same font inside the PDF.

For an explanation have a look at section 9.6.4 Font Subsets of the PDF specification ISO 32000-1:2008:

For a font subset, the PostScript name of the font — the value of the font’s BaseFont entry and the font descriptor’s FontName entry — shall begin with a tag followed by a plus sign (+). The tag shall consist of exactly six uppercase letters; the choice of letters is arbitrary, but different subsets in the same PDF file shall have different tags.

EXAMPLE EOODIA+Poetica is the name of a subset of Poetica®, a Type 1 font.
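
The rule quoted above can be checked mechanically. The following is a minimal sketch, assuming you only want to know whether a PostScript font name carries a subset tag; the class and method names (SubsetTagCheck, IsSubsetFontName) are hypothetical and not part of iTextSharp.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of iTextSharp.
public static class SubsetTagCheck
{
    // Exactly six uppercase ASCII letters, then '+', then at least one more character.
    private static readonly Regex SubsetName =
        new Regex(@"^[A-Z]{6}\+.+$", RegexOptions.CultureInvariant);

    // Returns true when the name follows the subset naming rule quoted above.
    public static bool IsSubsetFontName(string postscriptName)
    {
        return postscriptName != null && SubsetName.IsMatch(postscriptName);
    }
}
```

With the value from the question, SubsetTagCheck.IsSubsetFontName(curFont) would be true for MMLPEO+RemingtonNoiseless and false for a plain RemingtonNoiseless. A prefix containing digits or lowercase letters is treated as not being a subset tag, because the specification requires six uppercase letters.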

In other words: these characters aren't merely "junk". If you want to remove them, that's a no-brainer: just use the appropriate string manipulation method. But be aware that removing them throws away information that may be useful in some contexts.
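
One way to remove the prefix without losing it is to split the name into its tag and its base name. This is only a sketch under the assumption that the name follows the six-letter rule from the specification; FontNameParts and StripSubsetTag are hypothetical names, not iTextSharp API.

```csharp
// Hypothetical helper, not part of iTextSharp.
public static class FontNameParts
{
    // Returns the font name without its subset tag.
    // 'tag' receives the six-letter tag, or null when the name has none.
    public static string StripSubsetTag(string postscriptName, out string tag)
    {
        tag = null;
        if (postscriptName == null)
        {
            return null;
        }

        // A subset name has '+' at index 6, preceded by six uppercase letters.
        if (postscriptName.Length > 7 && postscriptName[6] == '+')
        {
            bool allUpper = true;
            for (int i = 0; i < 6; i++)
            {
                char c = postscriptName[i];
                if (c < 'A' || c > 'Z')
                {
                    allUpper = false;
                    break;
                }
            }

            if (allUpper)
            {
                tag = postscriptName.Substring(0, 6);
                return postscriptName.Substring(7);
            }
        }

        // No valid tag: leave the name untouched.
        return postscriptName;
    }
}
```

Called as FontNameParts.StripSubsetTag(curFont, out tag) with the curFont value from the question, this yields RemingtonNoiseless as the return value and MMLPEO as the tag. Keeping the tag around lets you still tell different subsets of the same font apart later.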
