Which PDF Generation API (Java) supports Gujarati Font?

Problem description

I have tried iText, PDFBox and Oracle Forms. With iText I succeeded in generating a Gujarati PDF document, but unfortunately the Gujarati (UTF-8) text is not rendered properly.

My project runs on JDK 1.4, and using it is mandatory. So I need an older version of an API that supports Gujarati fonts.

Please suggest any options.

Sample code:

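// Requires java.io.FileOutputStream plus the iText classes BaseFont, Font,
// Document, Paragraph and PdfWriter (packages com.lowagie.text and
// com.lowagie.text.pdf in iText 2.x; com.itextpdf.text and
// com.itextpdf.text.pdf in iText 5.x).
public void GeneratePDFusingiText(String lStrGujaratidata)
{
    try
    {
        // Load the Shruti (Gujarati) TrueType font with Unicode (Identity-H) encoding.
        BaseFont bf = BaseFont.createFont("C:\\Windows\\Fonts\\Shruti.ttf", BaseFont.IDENTITY_H, BaseFont.NOT_EMBEDDED);
        Font font = new Font(bf, 12);

        // Write a single paragraph containing the Gujarati text.
        Document document = new Document();
        PdfWriter.getInstance(document, new FileOutputStream("D:/GeneratePDFusingiText.pdf"));
        document.open();
        document.add(new Paragraph(lStrGujaratidata, font));
        document.close();
    }
    catch (Exception e)
    {
        System.out.println("Exception while generating PDF");
        e.printStackTrace();
    }
}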

Edit 1:

Perhaps the image is not being displayed. It is uploaded here.

Edit 2:

Step 1) I type a Gujarati string using Google Transliterate.

Step 2) I convert it to Unicode escapes using the BabelMap software, so that I can use it through a ResourceBundle.

Issue: Take the string બિલાડી (Biladi).

Its Unicode representation is: \u0AAC\u0ABF\u0AB2\u0ABE\u0AA1\u0AC0

Look at the first two code points above (\u0AAC\u0ABF). That is where I am getting the problem. If I change the sequence to \u0ABF\u0AAC\u0AB2\u0ABE\u0AA1\u0AC0, the PDF output is correct.

At the same time, that reordered sequence prints the wrong output in HTML, i.e.: િબલાડી

I need to handle both cases.

I have tried using "gu", "gu.UTF-8" and "UTF-8", but every time I get the same output.

Recommended answer

Updated answer

After your comment I realised that I was wrong: the diacritic character should appear second in the sequence of code points, even though it is rendered to the left of the main character.

So, it turns out, iText doesn't support this type of rendering for Indic character sets. Roughly speaking, iText uses AWT's Graphics2D to render non-Latin Unicode characters, one by one, as images in the PDF. (I guess this is because appropriate fonts are not necessarily installed on everyone's computer.) This feature doesn't take this special ordering into account.

iText does support similar behaviour for Arabic, using a class contributed by another developer. See com.itextpdf.text.pdf.ArabicLigaturizer (http://itext.svn.sourceforge.net/viewvc/itext/trunk/itext/src/main/java/com/itextpdf/text/pdf/ArabicLigaturizer.java?revision=5075&content-type=text/plain). Perhaps you could create a similar one yourself? (!)
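
As a rough illustration of what such a pre-processing step might look like, here is a minimal sketch that moves the Gujarati vowel sign I (U+0ABF) in front of the consonant cluster it follows, which is the manual swap described in the question. The class and method names are hypothetical and not part of iText. It assumes the PDF renderer draws glyphs in stored order, and it ignores the nukta, reph and every other shaping rule, so it is not a general Indic shaper. It sticks to JDK 1.4-era constructs.

```java
public class GujaratiPreBaseReorder {
    private static final char VOWEL_SIGN_I = '\u0ABF'; // pre-base vowel sign
    private static final char VIRAMA = '\u0ACD';       // joins consonants into a conjunct

    // Gujarati consonants occupy U+0A95 through U+0AB9.
    private static boolean isConsonant(char c) {
        return c >= '\u0A95' && c <= '\u0AB9';
    }

    /** Returns a copy of the text with each U+0ABF moved before its consonant cluster. */
    public static String toVisualOrder(String logical) {
        StringBuffer out = new StringBuffer(logical.length());
        for (int i = 0; i < logical.length(); i++) {
            char c = logical.charAt(i);
            if (c != VOWEL_SIGN_I) {
                out.append(c);
                continue;
            }
            // Walk back over the base consonant and any consonant+virama pairs before it.
            int start = out.length();
            if (start > 0 && isConsonant(out.charAt(start - 1))) {
                start--;
                while (start >= 2 && out.charAt(start - 1) == VIRAMA
                        && isConsonant(out.charAt(start - 2))) {
                    start -= 2;
                }
            }
            out.insert(start, c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String logical = "\u0AAC\u0ABF\u0AB2\u0ABE\u0AA1\u0AC0"; // Biladi in logical order
        String visual = toVisualOrder(logical);
        // True when the result matches the hand-swapped sequence from the question.
        System.out.println(visual.equals("\u0ABF\u0AAC\u0AB2\u0ABE\u0AA1\u0AC0"));
    }
}
```

If something like this were used, the reordered string would go only to the PDF code path; HTML output should keep the original logical order, because browsers already perform the reordering themselves.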

It looks like this has come up before:

  • http://thread.gmane.org/gmane.comp.java.lib.itext.general/56702/focus=59552
  • http://itext-general.2136553.n4.nabble.com/patch-for-complex-scripts-indic-rendering-td2167588.html

Original answer

Kem chho,

I believe that iText is displaying the correct characters, but the first 2 characters of your input have been 'flipped' before you translated the string into unicode points. So, the problem occurred before the data even gets to iText.

The underlying issue is that the 'first' character is a 'pre-base' character, which is a type of Diacritic. It's a bit like an 'accent' in European texts, in that it can't exist on its own, and its purpose is to embellish another character. In this case it turns a 'Ba' (બ) into a 'Bi'.

You'll see in the Unicode code chart that the first character (િ) is indeed code point \u0ABF, and the second (બ) is \u0AAC: http://en.wikipedia.org/wiki/Gujar%C4%81ti_script#Unicode

So, somewhere between Google Transliterate and your code point representation, these characters got flipped. You need to review how you did that translation.

How did you convert these characters to code points?
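
One way to check the order at each stage is to dump the string as escapes straight from Java and compare the result with what the conversion tool produced. The following is a minimal sketch; the class and method names are made up for illustration, and it uses only JDK 1.4-era constructs. Gujarati lies entirely in the Basic Multilingual Plane, so one Java char corresponds to one code point here.

```java
public class CodePointDump {
    /** Returns every UTF-16 char of s as a four-digit backslash-u escape, in stored order. */
    public static String toEscapes(String s) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            String hex = Integer.toHexString(s.charAt(i)).toUpperCase();
            out.append("\\u");
            // Left-pad with zeros to four hex digits.
            for (int pad = hex.length(); pad < 4; pad++) {
                out.append('0');
            }
            out.append(hex);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Biladi, with the consonant first and the vowel sign second.
        String biladi = "\u0AAC\u0ABF\u0AB2\u0ABE\u0AA1\u0AC0";
        System.out.println(toEscapes(biladi));
    }
}
```

Running the same dump on the string read from the ResourceBundle shows whether the first two code points were already swapped before the text reached iText.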

Seemingly, some interpreters place the 'pre-base' character after the main consonant, instead of before it:


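  • Note that when these characters are pasted into a (Linux) terminal,
    the first 2 characters come out back to front. I believe something
    similar is happening for you too.

  • You'll also notice that when you try to edit this word in Google
    Transliterate, you can't place the cursor between the first 2
    characters, and when you hit backspace, the left character is
    deleted before the right one.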

So, if you can work out where this 'flipping' occurred, then hopefully your solution will present itself.

Hope this helps.
