Does Lua support Unicode?

Question

Based on the link below, I'm confused as to whether the Lua programming language supports Unicode.

http://lua-users.org/wiki/LuaUnicode

It appears that it does, but with limitations. What I don't understand is whether those limitations are significant or not a big deal.

Recommended answer

You can certainly store Unicode strings in Lua, as UTF-8. You can use these as you would any string.

However, Lua doesn't provide any default support for higher-level "Unicode-aware" operations on such strings (e.g., counting string length in characters or converting lower case to upper case). Whether this lack matters to you really depends on what you intend to do with these strings.

Possible approaches, depending on your use:

  1. If you just want to input/output/store strings, and generally use them as "whole units" (for table indexing, etc.), you may not need any special handling at all. In this case, you just treat these strings as binary blobs.

Due to UTF-8's clever design, some types of string manipulation can be done on strings containing UTF-8 and will yield the correct result without taking any special care.

For instance, you can append strings, split them apart before/after ASCII characters, etc. As an example, if you have a string "開発.txt" and you search for "." in that string using string.find(string_var, ".", 1, true), and then split it using the normal string.sub function into "開発" and ".txt", those result strings will be correct UTF-8 strings even though you're not using any kind of "Unicode-aware" algorithm. (The fourth argument, true, requests a plain-text search; without it, "." is a Lua pattern that matches any character, so the search would stop at the first byte.)
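The split described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `split_at_dot` is a hypothetical helper name, and it assumes the source file and the string are UTF-8 encoded. Because "." is a pattern metacharacter in Lua, the sketch passes `true` as the fourth argument of `string.find` to search for a literal dot.

```lua
-- Split a UTF-8 string at the first literal "." using only the
-- byte-oriented standard string functions.
-- `split_at_dot` is a hypothetical helper name used for illustration.
local function split_at_dot(name)
  -- Fourth argument `true` = plain search; otherwise "." matches any character.
  local dot = string.find(name, ".", 1, true)
  if not dot then
    return name, ""          -- no dot: everything is the base name
  end
  -- "." is a single ASCII byte, so cutting right before it can never
  -- land in the middle of a multi-byte UTF-8 sequence.
  return string.sub(name, 1, dot - 1), string.sub(name, dot)
end

local base, ext = split_at_dot("開発.txt")
print(base, ext)
```

Both results remain valid UTF-8 because the cut happens at an ASCII byte, which never occurs inside a multi-byte sequence.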

Similarly, you can do case conversions on only the ASCII characters in strings (those with the high bit zero), and treat the rest of the bytes as binary data without corrupting them.
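One way to sketch an ASCII-only case conversion is shown below. `ascii_upper` is a hypothetical helper name, and the input is assumed to be UTF-8. The sketch deliberately avoids `string.upper`, whose behavior follows the current C locale, and instead touches only the bytes for a-z.

```lua
-- Upper-case only the ASCII letters a-z, leaving every other byte untouched.
-- `ascii_upper` is a hypothetical helper name used for illustration.
local function ascii_upper(s)
  -- Every byte of a multi-byte UTF-8 sequence is >= 0x80, so the set [a-z]
  -- (bytes 0x61-0x7A) can never match inside one of them.
  -- The outer parentheses drop gsub's second return value (the match count).
  return (string.gsub(s, "[a-z]", function(c)
    return string.char(string.byte(c) - 32)
  end))
end

print(ascii_upper("開発.txt"))
```

Non-ASCII letters such as "é" pass through unchanged, which is the limitation the text describes: converting them requires real Unicode case tables.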

Some UTF-8-aware operations are so simple that it's easy to just write one's own functions to do them.

For instance, to calculate the length of a string in Unicode characters (code points), count the bytes with the high bit zero (ASCII characters) and the bytes whose top two bits are 11 ("leading bytes" of non-ASCII characters); the length is the sum of those two counts.
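The counting rule above might look like this in plain Lua. `utf8_length` is a hypothetical helper name; the sketch assumes the input is valid UTF-8 and performs no validation, and it counts code points rather than user-perceived characters. Lua 5.3 and later also ship a standard `utf8` library whose `utf8.len` covers the same need.

```lua
-- Count the Unicode code points in a UTF-8 string by counting every byte
-- that starts a character. Assumes valid UTF-8; no validation is done.
-- `utf8_length` is a hypothetical helper name used for illustration.
local function utf8_length(s)
  local count = 0
  for i = 1, #s do
    local b = string.byte(s, i)
    -- b < 0x80  : ASCII byte (high bit zero)
    -- b >= 0xC0 : leading byte of a multi-byte sequence (top two bits 11)
    -- 0x80-0xBF : continuation byte (top two bits 10), not counted
    if b < 0x80 or b >= 0xC0 then
      count = count + 1
    end
  end
  return count
end

print(utf8_length("開発.txt"))  --> 6   (code points)
print(#"開発.txt")              --> 10  (bytes)
```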

For more complex operations (e.g., case conversion on non-ASCII characters), you'll probably have to use a Lua Unicode library, such as those listed on the (previously mentioned) Lua-users Unicode page.
