Sorting words according to letters of an old Semitic language
I use XSLT 3.0 with Saxon-PE 9.7.
I need to sort the orth elements according to the Ugaritic language, which is close to Hebrew but has additional characters.
I have tried:
<xsl:sort select="orth" data-type="text" order="ascending" lang="uga"/>
But the resulting order is not correct, so I think I need to describe the Ugaritic alphabetical order myself. How can I do that?
Thank you very much in advance.
Saxon allows you to define your own collation in its configuration file. You basically have to set up a configuration file with a section like this:
<collations>
  <collation uri="http://example.com/uga-trans"
             rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h &lt; w &lt; z &lt; ḥ &lt; ṭ &lt; y &lt; k &lt; š &lt; l &lt; m &lt; ḏ &lt; n &lt; ẓ &lt; s &lt; ʿ &lt; p &lt; ṣ &lt; q &lt; r &lt; ṯ &lt; ġ &lt; t &lt; ʾi &lt; ʾu &lt; s2"/>
</collations>
where the uri attribute defines a URI as the name for your collation, which you can then use in the collation attribute of an xsl:sort:
<xsl:perform-sort select="$input-seq">
<xsl:sort select="string()" collation="http://example.com/uga-trans"/>
</xsl:perform-sort>
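For context, here is a minimal sketch of a complete stylesheet that uses the collation. The input structure is an assumption, not something stated in the question: a hypothetical entries root element containing entry elements, each with an orth child. The collation URI is only resolved if Saxon is run with a configuration file that declares it.

```xml
<xsl:stylesheet version="3.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <!-- Hypothetical input: <entries><entry><orth>...</orth></entry>...</entries> -->
  <xsl:template match="entries">
    <xsl:copy>
      <!-- Process the entries in the order given by the custom collation -->
      <xsl:apply-templates select="entry">
        <xsl:sort select="orth" collation="http://example.com/uga-trans"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <!-- Copy each entry unchanged -->
  <xsl:template match="entry">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The element names entries and entry would need to be adapted to the actual document; only the collation attribute on xsl:sort matters here.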
The syntax to be used in the rules attribute is the one defined for the Java class RuleBasedCollator (https://docs.oracle.com/javase/7/docs/api/java/text/RuleBasedCollator.html); the documentation there includes an example for Norwegian. The only caveat is that the Java syntax is plain text while the Saxon configuration is XML, so each < that defines the ordering has to be escaped in the rules attribute as &lt;.
I based the rule above on the transcription sequence presented in the Wikipedia article https://en.wikipedia.org/wiki/Ugaritic_alphabet. I am not sure whether that is the order you are looking for.
You can run Saxon from the command line with -config:yourconfigurationfile.xml to use such a configuration; oXygen has a field in the Saxon-specific transformation scenario dialog to select a configuration file.
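As a rough sketch, a complete configuration file wrapping the collations section could look like the following. The root element and namespace follow the Saxon configuration file format. The edition value is an assumption chosen to match the Saxon-PE mentioned in the question, and the rule list is shortened here for readability, so the full sequence would have to be substituted.

```xml
<configuration xmlns="http://saxon.sf.net/ns/configuration"
               edition="PE">
  <collations>
    <!-- Each ordering operator "<" must be written as &lt; inside the attribute.
         Shortened rule list for illustration; use the full letter sequence in practice. -->
    <collation uri="http://example.com/uga-trans"
               rules="&lt; ʾa &lt; b &lt; g &lt; ḫ &lt; d &lt; h"/>
  </collations>
</configuration>
```

Saving this file under the name passed to -config makes the URI http://example.com/uga-trans available to any xsl:sort in the transformation.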