Bidirectional text in a Word document using Apache POI
Question
I am trying to add some Hebrew text to a Word document. It works fine, but when I add punctuation the text gets messed up.
This is the code I run:
public static void main(String[] args) throws Exception {
    XWPFDocument document = new XWPFDocument();
    XWPFParagraph paragraph = document.createParagraph();
    paragraph.setAlignment(ParagraphAlignment.LEFT);
    // make RTL direction
    CTP ctp = paragraph.getCTP();
    CTPPr ctppr;
    if ((ctppr = ctp.getPPr()) == null) {
        ctppr = ctp.addNewPPr();
    }
    ctppr.addNewBidi().setVal(STOnOff.ON);
    XWPFRun run = paragraph.createRun();
    run.setText("שלום עולם !");
    // create the document in the specific path by giving it a name
    File newFile = new File("helloWorld.docx");
    // insert document to newFile
    try {
        FileOutputStream output = new FileOutputStream(newFile);
        document.write(output);
        output.close();
        document.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
This is the "helloWorld.docx" I get:
This is how it should look:
Moreover, I want the whole document to be RTL (including bidirectional text), not just one specific paragraph.
Thanks for your help!
Recommended answer
This is a well-known problem with bidirectional text. The exclamation mark and the space are not right-to-left characters themselves; they are directionally neutral. So we need to mark them as right-to-left where needed. The RIGHT-TO-LEFT MARK (RLM) is U+200F. See https://en.wikipedia.org/wiki/Bidirectional_text#Table_of_possible_BiDi_character_types.
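To make the point about neutral characters concrete, here is a minimal sketch that asks the JDK for the Unicode bidi class of a Hebrew letter, a space, an exclamation mark, and the RLM itself. It uses only `Character.getDirectionality` from the standard library; the class name `BidiClassDemo` and the helper `describe` are hypothetical names chosen for this illustration.

```java
// Sketch: report the Unicode bidi class of a few characters (JDK only, no POI needed).
class BidiClassDemo {

    // Hypothetical helper: maps the JDK directionality constant to a readable label.
    static String describe(char c) {
        switch (Character.getDirectionality(c)) {
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:
                return "strong right-to-left";
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
                return "strong left-to-right";
            case Character.DIRECTIONALITY_WHITESPACE:
                return "whitespace (neutral)";
            case Character.DIRECTIONALITY_OTHER_NEUTRALS:
                return "other neutral";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        // Hebrew letter shin, space, exclamation mark, RIGHT-TO-LEFT MARK
        char[] samples = {'\u05E9', ' ', '!', '\u200F'};
        for (char c : samples) {
            System.out.printf("U+%04X: %s%n", (int) c, describe(c));
        }
    }
}
```

The space and the exclamation mark come back as neutral, while the Hebrew letter and U+200F are strong right-to-left characters. That is why placing an RLM next to a neutral character gives it a right-to-left neighbor to resolve against.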
The following code works for me:
import java.io.FileOutputStream;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraph {
    public static void main(String[] args) throws Exception {
        XWPFDocument doc = new XWPFDocument();
        XWPFParagraph paragraph = doc.createParagraph();
        // switch the paragraph to right-to-left
        CTP ctp = paragraph.getCTP();
        CTPPr ctppr;
        if ((ctppr = ctp.getPPr()) == null) ctppr = ctp.addNewPPr();
        ctppr.addNewBidi().setVal(STOnOff.ON);
        XWPFRun run = paragraph.createRun();
        // U+200F (RLM) follows the space and the exclamation mark
        run.setText("שלום עולם \u200F!\u200F");
        FileOutputStream out = new FileOutputStream("WordDocument.docx");
        doc.write(out);
        out.close();
        doc.close();
    }
}
Note the \u200F mark after the space and after the exclamation mark.
If the text lines come from a file, marking single characters is not the best practice. Instead, the whole text line should be marked as right-to-left text. To do so, we can wrap each text line in a U+202E RIGHT-TO-LEFT OVERRIDE (RLO) followed by a U+202C POP DIRECTIONAL FORMATTING (PDF). Keep in mind that RLO forces every enclosed character to be laid out right-to-left, including any digits or Latin letters in the line.
Example:
import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;
import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraphsFromFile {
    public static void main(String[] args) throws Exception {
        List<String> lines = Files.readAllLines(new File("HebrewTextFile.txt").toPath(), StandardCharsets.UTF_8);
        XWPFDocument doc = new XWPFDocument();
        for (String line : lines) {
            XWPFParagraph paragraph = doc.createParagraph();
            // switch the paragraph to right-to-left
            CTP ctp = paragraph.getCTP();
            CTPPr ctppr = ctp.getPPr();
            if (ctppr == null) ctppr = ctp.addNewPPr();
            ctppr.addNewBidi().setVal(STOnOff.ON);
            XWPFRun run = paragraph.createRun();
            // wrap the whole line in RLO ... PDF
            run.setText("\u202E" + line + "\u202C");
        }
        FileOutputStream out = new FileOutputStream("WordDocument.docx");
        doc.write(out);
        out.close();
        doc.close();
    }
}
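The loop in the file-based example wraps every line it reads, including empty lines and lines without any right-to-left characters. If that is not wanted, one possible guard is sketched below. It assumes that such lines should be passed through unchanged, which the original answer does not state. The class name `RtlLineWrapper` and the method `wrapIfRtl` are hypothetical; `java.text.Bidi.requiresBidi` is a standard JDK method.

```java
import java.text.Bidi;

// Sketch: wrap a line in RLO ... PDF only when it contains right-to-left content.
class RtlLineWrapper {
    static final char RLO = '\u202E'; // RIGHT-TO-LEFT OVERRIDE
    static final char PDF = '\u202C'; // POP DIRECTIONAL FORMATTING

    // Hypothetical helper; assumes empty and purely left-to-right lines stay as they are.
    static String wrapIfRtl(String line) {
        char[] chars = line.toCharArray();
        // requiresBidi reports whether the range contains right-to-left characters
        // or explicit directional controls
        if (chars.length == 0 || !Bidi.requiresBidi(chars, 0, chars.length)) {
            return line;
        }
        return RLO + line + PDF;
    }

    public static void main(String[] args) {
        String hebrew = wrapIfRtl("\u05E9\u05DC\u05D5\u05DD !");
        String latin = wrapIfRtl("Hello, world!");
        System.out.println(hebrew.length()); // 6 original chars plus the two control chars
        System.out.println(latin);
    }
}
```

In the loop, the call `run.setText("\u202E" + line + "\u202C")` would then become `run.setText(RtlLineWrapper.wrapIfRtl(line))`. Because `requiresBidi` also reports explicit directional controls, a line that is already wrapped would be wrapped a second time; the sketch does not try to detect that case.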
As of Apache POI 5.0.0, .setVal(STOnOff.ON) is no longer possible for Bidi, but .setVal(true) can be used instead:
//ctppr.addNewBidi().setVal(STOnOff.ON); // up to apache poi 4.1.2
ctppr.addNewBidi().setVal(true); // from apache poi 5.0.0 on