How to convert string to bytes in Ruby?
Question
How do I extend the String class and attach a method named to_bytes?
Recommended answer
Ruby already has a String#each_byte method, and String#bytes gives you the same bytes. In Ruby 1.9, bytes behaved as an alias of each_byte; since Ruby 2.0 it returns the bytes as an array.
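A minimal sketch of the extension the question asks for, assuming all that is wanted is an array of integer byte values. The method name to_bytes comes from the question and is not part of Ruby; here it simply wraps each_byte.

```ruby
# Hypothetical extension: to_bytes is not a built-in String method.
class String
  # Returns the string's bytes as an array of Integers (0..255).
  def to_bytes
    each_byte.to_a
  end
end

ascii_bytes = "abc".to_bytes      # three single-byte characters
utf8_bytes  = "\u00e9".to_bytes   # "é": one character, two bytes in UTF-8
```

On Ruby 2.0 or later, calling the built-in bytes method directly gives the same array, so reopening String is optional.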
Prior to Ruby 1.9, strings were equivalent to byte arrays; that is, a character was assumed to be a single byte. That's fine for ASCII text and for single-byte encodings such as Windows-1252 and ISO-8859-1, but it fails badly with Unicode, which we see more and more often on the web. Ruby 1.9+ is Unicode-aware: strings are no longer considered to be made up of bytes but of characters, each of which can be multiple bytes long.
So, if you are trying to manipulate text as single bytes, you'll need to ensure your input is ASCII, or at least a single-byte character set. If you might have multi-byte characters, you should use String#each_char, String#split(//), or String#unpack with the U flag.
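As a rough illustration of the difference between bytes and characters, the sketch below assumes a UTF-8 string containing one two-byte character. The variable names are made up for the example.

```ruby
text = "h\u00e9llo"                # "héllo"; é occupies two bytes in UTF-8

char_list  = text.each_char.to_a   # one element per character
codepoints = text.unpack("U*")     # one Unicode code point per character
byte_count = text.bytesize         # number of bytes
char_count = text.length           # number of characters
```

Here byte_count is 6 while char_count is 5, which is why byte-wise manipulation is only safe for single-byte text.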
With split, passing // is the same as passing ''. Either tells split to return individual characters. You can also usually use chars.
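A quick check that the three forms agree, assuming Ruby 2.0 or later (where chars returns an array) and a UTF-8 string:

```ruby
word = "na\u00efve"          # "naïve"; ï is a multi-byte character in UTF-8

by_regex = word.split(//)    # split on the empty regular expression
by_empty = word.split('')    # split on the empty string
by_chars = word.chars        # built-in character array

same = (by_regex == by_empty) && (by_empty == by_chars)
```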