Lua String char encoding


Question

I can't see what encoding Lua uses for its strings.

I am using

string.byte(s [,i [,j]])

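which is documented as follows:

Returns the internal numerical codes of the characters s[i], s[i+1], ···, s[j]. The default value for i is 1; the default value for j is i. Note that numerical codes are not necessarily portable across platforms.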

Reading around, people suggest it uses ASCII, which is fine for me, but I don't get the part about codes changing across platforms. I thought the very nature of using a single encoding (like ASCII) is that this wouldn't happen. Or is the note only there because ASCII does not define anything over 126 (or 127), so different countries / OEMs / OSs etc. may be using custom ASCII extensions from decades ago for the upper range?

It's important for me to know that [a-zA-Z] will have the same char values on all platforms I'm running on.

The Lua doc could be a bit more specific here!

Any light anyone can shed on this would be great, thanks.

Recommended answer

I'm fairly sure you can safely assume an ASCII-derived encoding. So the minuscule set of characters you're interested in stays the same.
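
If you would rather verify that assumption than trust it, a small check along these lines could be run once on each target platform. This is only a sketch: `letters_are_ascii` is a hypothetical helper name, not part of Lua, and it simply compares what `string.byte` reports against the ASCII codes 65 to 90 and 97 to 122.

```lua
-- Hypothetical helper (not a Lua built-in): returns true when string.byte
-- reports the ASCII codes for every letter in [A-Za-z].
local function letters_are_ascii()
  local upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  local lower = "abcdefghijklmnopqrstuvwxyz"
  for i = 1, 26 do
    -- In ASCII, 'A' is 65 and 'a' is 97, and the letters are contiguous.
    if string.byte(upper, i) ~= 64 + i then return false end
    if string.byte(lower, i) ~= 96 + i then return false end
  end
  return true
end

print(string.byte("A"), string.byte("z"))  -- 65 and 122 on an ASCII-derived platform
print(letters_are_ascii())
```

On a non-ASCII platform such as an EBCDIC mainframe, where 'A' is 193, the check would return false, which is presumably the kind of case the manual's portability note covers.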

The note about the codes changing between platforms likely means that Lua doesn't know anything about character encoding at all and thus just uses whatever bytes the OS hands out. On Linux this is likely UTF-8, which means you'd have to deal with individual code units (bytes) when stepping outside ASCII. On Windows I could imagine it being the system's legacy code page, which means sort-of Latin 1 (CP 1252) in much of the Western world.
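
To illustrate what "just bytes" means in practice, here is a minimal sketch that builds the same character, é (U+00E9), in two encodings. The variable names are made up for the example, and the bytes are written as decimal escapes so the encoding of the source file itself does not affect the result.

```lua
-- The same character, é (U+00E9), as two different byte sequences.
local utf8_e_acute   = "\195\169"  -- UTF-8: two bytes, 0xC3 0xA9
local cp1252_e_acute = "\233"      -- CP 1252 / Latin 1: one byte, 0xE9

-- # gives the length in bytes, and string.byte returns raw byte values;
-- Lua itself attaches no encoding to either string.
print(#utf8_e_acute, string.byte(utf8_e_acute, 1, -1))      -- 2, then 195 and 169
print(#cp1252_e_acute, string.byte(cp1252_e_acute, 1, -1))  -- 1, then 233
```

Both strings stay within plain Lua string semantics; which of the two you actually receive from a file, terminal, or API depends on the platform, not on Lua.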
