Bidirectional text in a Word document using Apache POI


Question

I am trying to add some Hebrew text to a Word document. It works fine, but when I add punctuation the text gets jumbled.

This is the code I run:

public static void main(String[] args) throws Exception {

    XWPFDocument document = new XWPFDocument();
    XWPFParagraph paragraph = document.createParagraph();

    paragraph.setAlignment(ParagraphAlignment.LEFT);

    // make RTL direction
    CTP ctp = paragraph.getCTP();
    CTPPr ctppr;
    if ((ctppr = ctp.getPPr()) == null) {
        ctppr = ctp.addNewPPr();
    }
    ctppr.addNewBidi().setVal(STOnOff.ON);

    XWPFRun run = paragraph.createRun();
    run.setText("שלום עולם !");

    // create the document in the specific path by giving it a name
    File newFile = new File("helloWorld.docx");

    // insert document to newFile
    try {
        FileOutputStream output = new FileOutputStream(newFile);
        document.write(output);
        output.close();
        document.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

This is the "helloWorld.docx" I get:

[screenshot]

This is how it should look:

[screenshot]

Moreover, I want the whole document to be RTL (even with bidirectional text), not just that specific paragraph.

Thanks for your help!

Recommended answer

That's a well-known problem with bidirectional text. The exclamation mark and the space are not right-to-left characters themselves, so we need to mark them as such where needed. The RIGHT-TO-LEFT MARK (RLM) is U+200F. See https://en.wikipedia.org/wiki/Bidirectional_text#Table_of_possible_BiDi_character_types.
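To see why the space and the exclamation mark need help, it can be useful to print the bidi class of each character. The following is a minimal sketch using only the JDK's java.lang.Character; the class name BidiClassDemo and the method bidiClass are made up for illustration and are not part of Apache POI.

```java
final class BidiClassDemo {

    // Maps the java.lang.Character directionality constants used here to the
    // short Unicode bidi class names; every other class is reported as "other".
    static String bidiClass(int codePoint) {
        switch (Character.getDirectionality(codePoint)) {
            case Character.DIRECTIONALITY_LEFT_TO_RIGHT:  return "L";
            case Character.DIRECTIONALITY_RIGHT_TO_LEFT:  return "R";
            case Character.DIRECTIONALITY_WHITESPACE:     return "WS";
            case Character.DIRECTIONALITY_OTHER_NEUTRALS: return "ON";
            default:                                      return "other";
        }
    }

    public static void main(String[] args) {
        // One Hebrew letter, a space, an exclamation mark, and an RLM.
        String sample = "\u05DD !\u200F";
        sample.codePoints()
              .forEach(cp -> System.out.printf("U+%04X %s%n", cp, bidiClass(cp)));
    }
}
```

The Hebrew letter and the RLM are reported as R, while the space is WS and the exclamation mark is ON. Neither of the latter two has a direction of its own, which is why an adjacent RLM can pull them into the right-to-left run.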

The following code works for me:

import java.io.FileOutputStream;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

public class CreateWordRTLParagraph {

 public static void main(String[] args) throws Exception {

  XWPFDocument doc= new XWPFDocument();

  XWPFParagraph paragraph = doc.createParagraph();
  CTP ctp = paragraph.getCTP();
  CTPPr ctppr;
  if ((ctppr = ctp.getPPr()) == null) ctppr = ctp.addNewPPr();
  ctppr.addNewBidi().setVal(STOnOff.ON);

  XWPFRun run = paragraph.createRun();
  run.setText("שלום עולם \u200F!\u200F");

  FileOutputStream out = new FileOutputStream("WordDocument.docx");
  doc.write(out);
  out.close();
  doc.close();

 }
}

Note the \u200F mark after the space and after the exclamation mark.
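Placing the marks by hand gets tedious for more than a few strings. As a rough sketch, a small helper could insert them automatically. The names RlmMarker and markNeutrals are hypothetical, and the rule is an assumption chosen to reproduce the string above: an RLM is added after every character of bidi class ON, and after a whitespace character that directly precedes one. Punctuation of other bidi classes, such as the full stop and the comma (both class CS), is deliberately left alone so that numbers are not broken up.

```java
final class RlmMarker {

    static final char RLM = '\u200F';

    // True for characters of bidi class ON (for example '!' and '?').
    static boolean isNeutralPunctuation(int codePoint) {
        return Character.getDirectionality(codePoint)
                == Character.DIRECTIONALITY_OTHER_NEUTRALS;
    }

    // Appends an RLM after each ON character, and after a whitespace
    // character that is immediately followed by an ON character.
    static String markNeutrals(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 8);
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            sb.appendCodePoint(cp);

            boolean nextIsNeutral = i < text.length()
                    && isNeutralPunctuation(text.codePointAt(i));
            boolean spaceBeforeNeutral = nextIsNeutral
                    && Character.getDirectionality(cp)
                            == Character.DIRECTIONALITY_WHITESPACE;

            if (isNeutralPunctuation(cp) || spaceBeforeNeutral) {
                sb.append(RLM);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Hebrew sample text from the question, written with escapes.
        String marked = markNeutrals("\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD !");
        // Print code points so that the invisible marks become visible.
        marked.codePoints().forEach(cp -> System.out.printf("U+%04X ", cp));
        System.out.println();
    }
}
```

With this in place, the run text could be set with run.setText(RlmMarker.markNeutrals(text)). The helper is only meant for text that is Hebrew plus punctuation; lines that mix in Latin words would need more care.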

If the text lines come from a file, marking individual characters is not practical. Instead, the whole line should be marked as right-to-left text. To do so, we can enclose each line between a U+202E RIGHT-TO-LEFT OVERRIDE (RLO) and a U+202C POP DIRECTIONAL FORMATTING (PDF).

Example:

import java.io.File;
import java.io.FileOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

import org.apache.poi.xwpf.usermodel.*;

import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTP;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTPPr;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.STOnOff;

import java.util.List;

public class CreateWordRTLParagraphsFromFile {

 public static void main(String[] args) throws Exception {

  List<String> lines = Files.readAllLines(new File("HebrewTextFile.txt").toPath(), StandardCharsets.UTF_8);

  XWPFDocument doc= new XWPFDocument();

  for (String line : lines) {

   XWPFParagraph paragraph = doc.createParagraph();
   CTP ctp = paragraph.getCTP();
   CTPPr ctppr = ctp.getPPr();
   if (ctppr == null) ctppr = ctp.addNewPPr();
   ctppr.addNewBidi().setVal(STOnOff.ON);

   XWPFRun run = paragraph.createRun();
   run.setText("\u202E" + line + "\u202C");

  }

  FileOutputStream out = new FileOutputStream("WordDocument.docx");
  doc.write(out);
  out.close();
  doc.close();

 }
}
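The example above wraps every line unconditionally. A more defensive variation, sketched here, wraps a line only when it actually contains right-to-left content and has not been wrapped already. RloWrapper and wrapRtl are hypothetical names; the check relies on the JDK's java.text.Bidi.requiresBidi. Skipping pure left-to-right lines is an assumption of this sketch, made because RLO forces every character right-to-left, so a Latin-only line would be displayed reversed.

```java
final class RloWrapper {

    static final char RLO = '\u202E'; // RIGHT-TO-LEFT OVERRIDE
    static final char PDF = '\u202C'; // POP DIRECTIONAL FORMATTING

    // Encloses the line in RLO ... PDF, unless it is empty, already
    // enclosed, or contains no right-to-left characters at all.
    static String wrapRtl(String line) {
        if (line.isEmpty()) {
            return line;
        }
        boolean alreadyWrapped = line.charAt(0) == RLO
                && line.charAt(line.length() - 1) == PDF;
        char[] chars = line.toCharArray();
        boolean hasRtl = java.text.Bidi.requiresBidi(chars, 0, chars.length);
        if (alreadyWrapped || !hasRtl) {
            return line;
        }
        return RLO + line + PDF;
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD !";
        // Length grows by two for the Hebrew line and stays the same for "abc".
        System.out.println(wrapRtl(hebrew).length() - hebrew.length());
        System.out.println(wrapRtl("abc").length() - "abc".length());
    }
}
```

In the loop above, the call would then read run.setText(RloWrapper.wrapRtl(line)).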


As of Apache POI 5.0.0, .setVal(STOnOff.ON) is no longer possible for Bidi, but .setVal(true) can be used instead:

  //ctppr.addNewBidi().setVal(STOnOff.ON); // up to apache poi 4.1.2
  ctppr.addNewBidi().setVal(true); // from apache poi 5.0.0 on
