Why is the Gujarati-Indian text not rendered correctly using Arial Unicode MS?


Problem description

This is a follow-up to the question "How to export fonts in Gujarati-Indian Language to pdf?". @amedee-van-gasse, QA Engineer at iText, asked me to post a question specific to iText with a relevant MCVE.

Why is this sequence of Unicode characters, \u0ab9\u0abf\u0aaa\u0acd\u0ab8, not rendered correctly?

It should be rendered like this:

હિપ્સ (also checked with unicode-converter)
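To double-check which characters the escape sequence encodes, a small sketch using only the Java standard library can list each code point with its Unicode name. The class name `CodePointDump` and the method `describe` are made up for this illustration; nothing here involves iText.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper (hypothetical name): lists the code points of a string. */
class CodePointDump {

    /** The test string from the question, written with proper Unicode escapes. */
    static final String TEST = "\u0ab9\u0abf\u0aaa\u0acd\u0ab8";

    /** Returns one line per code point, formatted as "U+XXXX NAME". */
    static List<String> describe(String s) {
        List<String> lines = new ArrayList<>();
        for (int cp : s.codePoints().toArray()) {
            // Character.getName looks up the Unicode character name known to the JDK
            lines.add(String.format("U+%04X %s", cp, Character.getName(cp)));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe(TEST)) {
            System.out.println(line);
        }
    }
}
```

The string is stored in logical order: the vowel sign U+0ABF comes after the consonant U+0AB9, even though it is drawn to the left of it in હિ.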

However, this code (example adapted from iText: Chapter 11: Choosing the right font)

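import java.io.FileOutputStream;
import java.io.IOException;

// iText 5 package names; iText 2.1.7 uses com.lowagie.text instead
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Font;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.BaseFont;
import com.itextpdf.text.pdf.ColumnText;
import com.itextpdf.text.pdf.PdfWriter;

public class FontTest {

    /** The resulting PDF file. */
    public static final String RESULT = "fontTest.pdf";
    /** The text to render. */
    public static final String TEST = "\u0ab9\u0abf\u0aaa\u0acd\u0ab8";

    public void createPdf(String filename) throws IOException, DocumentException {
        Document document = new Document();
        PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(filename));
        document.open();
        // Embed the font with Identity-H encoding so all Unicode glyphs are available
        BaseFont bf = BaseFont.createFont(
            "ARIALUNI.TTF", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
        Font font = new Font(bf, 20);
        ColumnText column = new ColumnText(writer.getDirectContent());
        column.setSimpleColumn(36, 730, 569, 36);
        column.addElement(new Paragraph(TEST, font));
        column.go();
        document.close();
        System.out.println("DONE");
    }

    public static void main(String[] args) throws IOException, DocumentException {
        new FontTest().createPdf(RESULT);
    }
}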

Generates this result:

That looks different from the expected rendering:

હિપ્સ

I have tested with itextpdf-5.5.4.jar, itextpdf-5.5.9.jar, and also itext-2.1.7.js3.jar (distributed with jasper-reports).

The font used is ARIALUNI.TTF, the one distributed with MS Office; it can be downloaded from here: Arial Unicode MS. (There may be legal issues with downloading it; see Mike 'Pomax' Kamermans' comment.)

Solution

Neither iText5 nor iText2 (which is a very outdated version, by the way) supports rendering of Indic scripts, no matter which font you select.

Rendering Indic scripts is not like rendering Latin script: a long series of additional steps is needed to get the correct result. For example, some characters must first be reordered according to the language rules.
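To make the reordering step concrete, here is a deliberately simplified sketch in plain Java. It is not iText code and not a real shaper: the class `PreBaseReorderSketch` and its methods are hypothetical, and it handles only one rule, moving the Gujarati vowel sign I (U+0ABF) in front of the consonant it follows. A real shaping engine works on glyphs rather than characters and also has to form conjuncts such as પ્સ, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration of one Indic shaping step (hypothetical class, not a real shaper). */
class PreBaseReorderSketch {

    /** GUJARATI VOWEL SIGN I: stored after its consonant, drawn before it. */
    static final int VOWEL_SIGN_I = 0x0ABF;

    /** Rough consonant check: the block range KA..HA (a few points in it are unassigned). */
    static boolean isConsonant(int cp) {
        return cp >= 0x0A95 && cp <= 0x0AB9;
    }

    /** Converts logical (storage) order to a naive visual order for vowel sign I only. */
    static String toVisualOrder(String logical) {
        List<Integer> out = new ArrayList<>();
        for (int cp : logical.codePoints().toArray()) {
            int last = out.size() - 1;
            if (cp == VOWEL_SIGN_I && last >= 0 && isConsonant(out.get(last))) {
                // Insert the vowel sign before the consonant it was stored after
                out.add(last, cp);
            } else {
                out.add(cp);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int cp : out) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String visual = toVisualOrder("\u0ab9\u0abf\u0aaa\u0acd\u0ab8");
        // Print the reordered code points in hex so the change is visible
        visual.codePoints().forEach(cp -> System.out.printf("U+%04X ", cp));
        System.out.println();
    }
}
```

A library that skips this kind of step simply places the glyphs in storage order, which is consistent with the incorrect output described in the question.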

This is a known issue at iText.

There is a stub implementation for Gujarati in iText5 called GujaratiLigaturizer, but the implementation is really poor and you cannot expect to get correct results with it.

You can try to process your string with this ligaturizer and then output the resultant string in the following way:

IndicLigaturizer g = new GujaratiLigaturizer();
String processed = g.process(inputString);
// proceed with the processed string
