How to read the font size of each word in a Word document using POI?
Question
I am trying to find out whether anything in a Word document has a font size of 2, but I have not been able to do this. To begin with, I tried to read the font of each word in a sample Word document that has only one line and 7 words. I am not getting the correct results.
Here is my code:
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

// fileStream is an InputStream opened on the .doc file
HWPFDocument doc = new HWPFDocument(fileStream);
Range range = doc.getRange();
// Walk every paragraph, then every character run inside it
for (int i = 0; i < range.numParagraphs(); i++) {
    Paragraph pr = range.getParagraph(i);
    for (int k = 0; k < pr.numCharacterRuns(); k++) {
        CharacterRun run = pr.getCharacterRun(k);
        System.out.println("Color: " + run.getColor());
        System.out.println("Font: " + run.getFontName());
        System.out.println("Font Size: " + run.getFontSize());
    }
}
However, the above code always doubles the font size: if the actual font size in the document is 12, it outputs 24, and if the actual font size is 8, it outputs 16.
Is this the correct way to read the font size from a Word document?
Recommended answer
Yes, that's the correct way; the measurement is in half-points.
In a docx, you'd have something like:
<w:rPr>
<w:sz w:val="28" />
</w:rPr>
The ECMA 376 spec on @sz defines the unit as half-points, so a w:val of 28 means a 14 pt font.
It's the same with the binary doc format, which HWPF supports. If you look at [MS-DOC], you'll see it also specifies the size of text in half-points.
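The conversion itself is simple arithmetic. Below is a minimal sketch, assuming the input is a half-point value such as the one returned by `run.getFontSize()` in HWPF or read from a `w:sz` attribute in a docx. The class and method names (`HalfPoints`, `toPoints`, `isPointSize`) are hypothetical helpers for illustration and are not part of POI.

```java
public class HalfPoints {
    // Converts a size expressed in half-points to points.
    public static double toPoints(int halfPoints) {
        return halfPoints / 2.0;
    }

    // Returns true if a half-point value corresponds to the given whole point size.
    public static boolean isPointSize(int halfPoints, int points) {
        return halfPoints == points * 2;
    }

    public static void main(String[] args) {
        System.out.println(toPoints(24));      // prints 12.0
        System.out.println(toPoints(28));      // prints 14.0
        System.out.println(isPointSize(4, 2)); // prints true
    }
}
```

To look for 2 pt text as in the question, one could test each run with something like `HalfPoints.isPointSize(run.getFontSize(), 2)`, which compares the raw value against 4.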