Is it appropriate to use non-ASCII (natural-language) XML tags?

Question

Is it appropriate to use XML tags (element names) written in non-ASCII natural languages? The XML spec allows it (see Names and Exceptions), but I couldn't find any best practices about this on the W3C site or related pages.

What I'm looking for is practical advice: which tools support this, whether important XML-related technologies such as XSLT and XForms may have problems with it, and so on.

I think Andrey and Tomalak are missing the point. XML is not necessarily read by programmers; it is read by many different professionals. So the arguments comparing it to source code don't necessarily apply.

Let me clarify: I mean a Bulgarian legal domain, where many terms are specific to the Bulgarian legal process and may not even have exact English translations. Translating them would be laborious, imprecise and impractical. Transliterating to ASCII is suboptimal.

So back to the question: what tool limitations would I face? (Eclipse supports UTF, so writing XPath expressions wouldn't be a problem.)
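One quick way to probe a given tool is to feed it a small document with Cyrillic names and run a path query against it. The sketch below does this with Python's standard `xml.etree.ElementTree`, used here only as a stand-in; the element and attribute names are hypothetical, and the same probe would have to be repeated against the actual Java tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the Bulgarian element and attribute names are
# made up for illustration and are not taken from a real schema.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<дело номер="123">
  <страна роля="ищец">Иван Петров</страна>
  <страна роля="ответник">Община София</страна>
</дело>
""".encode("utf-8")

# Parse from bytes so the parser honours the encoding declaration.
root = ET.fromstring(SAMPLE)

# Path expression with a Cyrillic element name and attribute predicate.
plaintiffs = root.findall("./страна[@роля='ищец']")

print(root.tag, root.get("номер"))
print([p.text for p in plaintiffs])
```

If a tool in the chain fails on a probe this small, that is a concrete limitation worth knowing before committing to the naming scheme.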

To get people started in the technical direction that I'd like: in several systems we've used generation techniques to ensure perfect correspondence between XML schemas, Java beans and database schemas.

  • Java: this article says that Unicode is ok
  • Oracle: identifiers can contain only alphanumeric characters from your database character set
  • I'd have to check for the tooling we use (JibX, Dozer, Hibernate, JXPath...)
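For the database point, a generator could flag in advance which element names cannot be represented in the target character set. This is a rough sketch only: the function name `names_not_encodable` and the sample names are hypothetical, the character sets are examples, and encodability covers just one part of a database's identifier rules (length limits, reserved words and permitted punctuation are not checked).

```python
def names_not_encodable(names, charset):
    """Return the names that cannot be encoded in the given character set."""
    rejected = []
    for name in names:
        try:
            name.encode(charset)
        except UnicodeEncodeError:
            rejected.append(name)
    return rejected


# Hypothetical element names that a generator might map to column names.
element_names = ["дело", "ищец", "case"]

# An ASCII-only character set rejects the Cyrillic names.
print(names_not_encodable(element_names, "ascii"))

# A Cyrillic-capable character set such as Windows-1251 accepts all three.
print(names_not_encodable(element_names, "cp1251"))
```

Names that fail the check would need an explicit mapping in the generator rather than a one-to-one copy.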

Recommended answer

If the content of the documents will be in Bulgarian, then the markup should be able to be in Bulgarian as well.

If your tool chain can't parse the tags in that language, then how can you be sure that it is handling the content correctly?
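That argument suggests testing tags and content together. A minimal round-trip check, assuming Python's standard `xml.etree.ElementTree` as a stand-in for one link in the chain, might look like the sketch below; `survives_round_trip`, `flatten` and the sample document are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET


def flatten(elem):
    """List (tag, attributes, stripped text) for elem and its descendants."""
    return [(e.tag, dict(e.attrib), (e.text or "").strip()) for e in elem.iter()]


def survives_round_trip(xml_bytes):
    """Parse, serialize to UTF-8, reparse, and compare tags, attributes and text."""
    original = ET.fromstring(xml_bytes)
    serialized = ET.tostring(original, encoding="utf-8")
    reparsed = ET.fromstring(serialized)
    return flatten(original) == flatten(reparsed)


# Hypothetical document with Bulgarian markup and Bulgarian content.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<дело номер="123">
  <ищец>Иван Петров</ищец>
  <ответник>Община София</ответник>
</дело>
""".encode("utf-8")

print(survives_round_trip(SAMPLE))
```

Running the same document through each real tool in the chain and comparing the result in this way exercises element names and text with a single test.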

Programmers will always have to learn the language of the target domain, whether it be finance, genetics, engineering or the Bulgarian legal system. Compromising usability for the convenience of the programmer is almost always a 'Bad Thing'. Whatever effort is saved up front ends up being lost to impeded end-user productivity and to support effort and cost over the lifetime of the product.
