Is there a programming language with full and correct Unicode support?


Problem description

Most programming languages have some support for Unicode, but each has more or less well-documented corner cases where things do not work correctly.


Examples

Java: reverse() in StringBuilder/StringBuffer works correctly, but length(), charAt(), etc. in String do not if a character needs more than 16 bits to encode.

C#: I didn't find a correct reverse method; Length and indexed access return wrong results.

Perl: Same problem.

PHP: Has no notion of Unicode at all; mbstring provides some better-working replacements.
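The 16-bit problem described for Java and C# can be reproduced without either language. The following is a minimal sketch in Python (assuming Python 3.3 or later, where `len()` on a `str` counts code points). It compares the code point count of a string with the number of UTF-16 code units, which is what a UTF-16-based `length()` reports. The helper name `utf16_index_splits_pair` is made up for this illustration.

```python
# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane,
# so UTF-16 needs a surrogate pair (two 16-bit units) to encode it.
text = "a\U0001D11E"

code_points = len(text)                           # code points in Python 3.3+
utf16_units = len(text.encode("utf-16-le")) // 2  # 2 bytes per 16-bit code unit

print(code_points)  # 2
print(utf16_units)  # 3


def utf16_index_splits_pair(s: str, unit_index: int) -> bool:
    """Return True if the 16-bit unit at unit_index is half of a surrogate pair.

    Hypothetical helper: it mimics what indexed access into a UTF-16 string
    would hand back at that position.
    """
    raw = s.encode("utf-16-le")
    unit = int.from_bytes(raw[2 * unit_index:2 * unit_index + 2], "little")
    return 0xD800 <= unit <= 0xDFFF


print(utf16_index_splits_pair(text, 0))  # False: the unit is 'a'
print(utf16_index_splits_pair(text, 1))  # True: high surrogate of U+1D11E
```

Indexing by code unit therefore returns half a character at position 1, which is the kind of wrong result the examples above complain about.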


I wonder whether there is a programming language that has full and correct Unicode support. What compromises had to be made to achieve that?

  • More complex algorithms?
  • Higher memory consumption?
  • Slower performance?

How was it implemented internally?

  • Array of Ints, Linked Lists, etc.
  • Additional buffering
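To make the memory side of these trade-offs concrete, here is a rough sketch in Python that measures how many bytes the same text occupies in UTF-8, UTF-16 and UTF-32. A fixed-width "array of ints" roughly corresponds to UTF-32: indexing by code point is trivial, but every code point costs four bytes. The sample strings and the helper name `encoded_sizes` are assumptions chosen for illustration, and the numbers say nothing about how any particular language runtime stores its strings.

```python
samples = {
    "ascii": "hello",
    "bmp": "h\u00e9llo",       # contains U+00E9, inside the BMP
    "astral": "hi\U0001D11E",  # contains U+1D11E, outside the BMP
}


def encoded_sizes(s: str) -> dict:
    """Return the byte length of s in UTF-8, UTF-16 and UTF-32 (no BOM)."""
    return {
        "utf-8": len(s.encode("utf-8")),
        "utf-16": len(s.encode("utf-16-le")),
        "utf-32": len(s.encode("utf-32-le")),
    }


for name, s in samples.items():
    print(name, len(s), encoded_sizes(s))
# ascii 5 {'utf-8': 5, 'utf-16': 10, 'utf-32': 20}
# bmp 5 {'utf-8': 6, 'utf-16': 10, 'utf-32': 20}
# astral 3 {'utf-8': 6, 'utf-16': 8, 'utf-32': 12}
```

The variable-width encodings are smaller, but finding the n-th code point in them requires scanning or an extra index, which is where more complex algorithms or additional buffering come in.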

I saw that Python 3 had some pretty big changes in this area. How close is Python 3 now to a correct implementation?

Solution

It looks like Perl 6 gets good Unicode support:

perlgeek.de/en/article/5-to-6#post_17

For instance, it provides three different length methods:

  • bytes (number of bytes)
  • codes (number of code points)
  • graphs (number of graphemes)
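The difference between the three lengths can be sketched in Python. This is not Perl 6 code and not how Perl 6 implements these methods. The function names below are made up to mirror the list above, the byte count assumes UTF-8, and the grapheme count is only an approximation: it skips code points with a non-zero canonical combining class rather than implementing the full Unicode text segmentation rules.

```python
import unicodedata


def length_bytes(s: str, encoding: str = "utf-8") -> int:
    """Number of bytes in the encoded form (UTF-8 assumed by default)."""
    return len(s.encode(encoding))


def length_codes(s: str) -> int:
    """Number of code points (what len() counts in Python 3.3+)."""
    return len(s)


def length_graphs_approx(s: str) -> int:
    """Approximate grapheme count: ignore combining marks.

    Assumption: a code point with a non-zero canonical combining class
    attaches to the preceding base character. Emoji sequences, Hangul jamo
    and other cases are not handled.
    """
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


# 'e' + COMBINING ACUTE ACCENT (U+0301) + MUSICAL SYMBOL G CLEF (U+1D11E)
text = "e\u0301\U0001D11E"

print(length_bytes(text))          # 7
print(length_codes(text))          # 3
print(length_graphs_approx(text))  # 2
```

A reader sees two characters, the string holds three code points, and its UTF-8 form takes seven bytes, so a single "length" cannot be correct for every purpose.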

This gets integrated into Perl's regular expressions as well.

That looks like a step in the right direction to me.
