ICU custom transliteration

Views: 55 Published: 2021/9/9 19:16:28 Tags: unicode transform icu transliteration

Problem description

I want to use the ICU library for transliteration, and I would like to provide a custom transliteration file for a set of specific custom transliterations. The file should be built into the ICU core at compile time so that it can be used in binary form elsewhere. I am working with the ICU 4.2 source for compatibility reasons.

As I understand it from the ICU Data page of their website, one way to do this is to create the file trnslocal.mk in ICUHOME/source/data/translit/ containing the single line TRANSLIT_SOURCE_LOCAL=custom.txt.

For the custom.txt file itself, I used the following format, based on the master file root.txt:

custom{
    RuleBasedTransliteratorIDs {
        Kanji-Romaji {
            file {
                resource:process(transliterator){"custom/Kanji_Romaji.txt"}
                direction{"FORWARD"}
            }
        }
    }
    TransliteratorNamePattern {
        // Format for the display name of a Transliterator.
        // This is the language-neutral form of this resource.
        "{0,choice,0#|1#{1}|2#{1}-{2}}" // Display name
    }
    // Transliterator display names
    // This is the English form of this resource.
    "%Translit%Hex"         { "%Translit%Hex" }
    "%Translit%UnicodeName" { "%Translit%UnicodeName" }
    "%Translit%UnicodeChar" { "%Translit%UnicodeChar" }
    TransliterateLATIN{
        "",
        ""
    }
}

I then stored the file Kanji_Romaji.txt, as found here, in the directory custom. Because it uses > instead of the → I have seen in other files, I converted each entry accordingly, so they now look like:

丁 → Tei ;
七 → Shichi ;
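
For illustration, the conversion of each entry could be sketched as below. This is a minimal sketch and not part of ICU: the helper name toArrowRule is hypothetical, and it assumes the file is UTF-8 and that every line is a simple "source > target ;" rule in which > is never quoted or escaped.

```cpp
#include <string>

// Hypothetical helper (not an ICU function): rewrites the first ASCII '>'
// in a simple "source > target ;" rule line to the arrow U+2192.
// Assumes UTF-8 input and that '>' is not quoted or escaped in the rule.
std::string toArrowRule(const std::string& line) {
    const std::string arrow = "\xE2\x86\x92";  // UTF-8 bytes of U+2192
    // The byte 0x3E ('>') never occurs inside a multi-byte UTF-8 sequence,
    // so a plain byte search is safe here.
    const std::string::size_type pos = line.find('>');
    if (pos == std::string::npos) {
        return line;  // no '>' on this line; leave it unchanged
    }
    std::string out = line;
    out.replace(pos, 1, arrow);  // swap the single '>' byte for the arrow
    return out;
}
```

Applying the helper to every line of the original file and writing the results back would give entries in the form shown above; lines without a > pass through unchanged.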

When I compile the ICU project, I get no errors.

However, when I try to use this custom transliterator in a test file (one that works fine with the built-in transliterators), I get the error error: 65569:U_INVALID_ID.

I am using the following code to construct the transliterator and output the error:

UErrorCode status = U_ZERO_ERROR;
Transliterator *K_R = Transliterator::createInstance("Kanji-Romaji", UTRANS_FORWARD, status);
if (U_FAILURE(status))
{
    std::cout << "error: " << status << ":" << u_errorName(status) << std::endl;
    return 0;
}

Additionally, looping over Transliterator::countAvailableIDs() and Transliterator::getAvailableID(i) does not list my custom transliterator. I remember reading that custom converters must be registered in /source/data/mappings/convrtrs.txt. Is there a similar file for transliterators?

It seems that my custom transliterator is either not being built into the appropriate packages (though there are no compile errors), is improperly formatted, or is somehow not being registered for use. Incidentally, I am aware of the RuleBasedTransliterator route at runtime, but I would prefer to compile the custom transliterations in so that they are available in any produced binary.

Let me know if any additional clarification is necessary. I know there is at least one ICU programmer on here who has been quite helpful in other posts I have written and seen elsewhere. I would appreciate any help. Thank you in advance!
