How to extract a single character (as a string) from a larger string in Ruby?

Question

What is the idiomatic Ruby way to retrieve a single character from a string as a one-character string? There is the str[n] method of course, but in Ruby 1.8 it returns the character code as a Fixnum, not a string. How do you get a single-character string?

Recommended answer

In Ruby 1.9 this is easy: strings are encoding-aware sequences of characters, so indexing into one returns a single-character string:

'µsec'[0] # => 'µ'
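As a slightly fuller illustration of the same idea, here is a minimal sketch that assumes Ruby 1.9 or later and a UTF-8 string. The variable names are only illustrative, and the `\u00B5` escape is used so the example does not depend on the encoding of the source file.

```ruby
# Assumes Ruby 1.9+; "\u00B5" is the micro sign, so str is 'µsec' in UTF-8.
str = "\u00B5sec"

first  = str[0]      # one-character String, not an Integer
last   = str[-1]     # negative indexes count from the end of the string
middle = str[1, 1]   # the (start, length) form also returns a String
beyond = str[10]     # nil when the index is out of range
code   = str[0].ord  # the code point, if the number is what you actually want

puts first, last, middle, beyond.inspect, code
```

If the character code is needed after all, `ord` recovers it from the one-character string.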

In Ruby 1.8, however, strings are sequences of bytes and know nothing about their encoding. If you index into a string that uses a multibyte encoding, you risk landing in the middle of a multibyte character (in this example, 'µ' is encoded in UTF-8 as two bytes):

'µsec'[0] # => 194
'µsec'[0].chr # => Garbage
'µsec'[0,1] # => Garbage
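The byte-level view behind those results can be reproduced on a newer Ruby. The following is a sketch, not Ruby 1.8 code: it assumes Ruby 1.9.3 or later and uses the byte-oriented methods `getbyte`, `bytes` and `byteslice` to show what a byte-indexed string sees.

```ruby
# Assumes Ruby 1.9.3+; "\u00B5sec" is 'µsec' encoded in UTF-8.
str = "\u00B5sec"

first_byte = str.getbyte(0)      # 194 (0xC2), the number Ruby 1.8 returns for str[0]
all_bytes  = str.bytes.to_a      # [194, 181, 115, 101, 99]: 'µ' occupies two bytes
half_char  = str.byteslice(0, 1) # only the first byte of 'µ'

puts first_byte
puts all_bytes.inspect
puts half_char.valid_encoding?   # false: half of a character is not valid UTF-8
puts str.length                  # 4 characters
puts str.bytesize                # 5 bytes
```

The one-byte slice is the "garbage" mentioned above: a lone lead byte that does not form a complete UTF-8 character.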

Regexps and some specialized string methods in Ruby 1.8 do, however, support a small subset of popular encodings, among them some Japanese encodings (e.g. Shift-JIS) and (as in this example) UTF-8:

'µsec'.split('')[0] # => 'µ'
'µsec'.split(//u)[0] # => 'µ'
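Building on the regexp approach, one could wrap it in a small helper that returns the character at a given position. This is only a sketch: `char_at` is a hypothetical name, not a built-in method, and the code has been reasoned through for current Ruby versions, where the `u` flag is still accepted. It relies on the same UTF-8-aware regexp matching described above, so it is expected, but not verified here, to behave the same way on Ruby 1.8.

```ruby
# encoding: utf-8

# Hypothetical helper: returns the n-th character of a UTF-8 string as a
# one-character String, or nil if n is out of range.
def char_at(str, n)
  # /./mu matches one UTF-8 character at a time; the m flag lets "." match newlines too.
  str.scan(/./mu)[n]
end

puts char_at('µsec', 0)
puts char_at('µsec', 1)
puts char_at('µsec', 10).inspect
```

Scanning the whole string for every lookup is wasteful for long strings; if many characters are needed, it would be cheaper to call `scan` once and index into the resulting array.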
