Access random rune element of string without using for ... range
Question
I recently asked this question, and the answers improved my understanding, but they didn't solve the actual problem I had. So I will try to ask a similar but different question, as follows.
Suppose that I want to access a random rune element of a string. One way is:
    func RuneElement(str string, idx int) rune {
        i := 0 // rune counter; the index that range yields is a byte offset, not a rune index
        for _, c := range str {
            if i == idx {
                return c
            }
            i++
        }
        return 0 // out of range -> proper handling is needed
    }
What if I want to call such a function many times? I guess what I am looking for is an operator/function like str[i] (which returns a byte) that returns the rune element at the i-th position. Why can this element be accessed using for ... range but not through a function like str.At(i), for example?
Recommended answer
string values in Go store the UTF-8 encoded byte sequence of the text. This is a design decision that has been made and it won't change.
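To make the byte-versus-rune distinction concrete, here is a minimal sketch. The helper name byteAndRuneCounts is made up for illustration; len and utf8.RuneCountInString are standard Go.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (hypothetical helper) reports how many bytes and how
// many runes the string s contains.
func byteAndRuneCounts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "h\u00e9llo" // U+00E9 (e with acute accent) takes two bytes in UTF-8
	b, r := byteAndRuneCounts(s)
	fmt.Println(b, r) // byte count and rune count differ

	// Indexing yields a single byte, here the first byte of the two-byte rune.
	fmt.Printf("%x\n", s[1])

	// range decodes runes; i is the byte offset where each rune starts.
	for i, c := range s {
		fmt.Printf("byte offset %d: %c\n", i, c)
	}
}
```

Note that the index produced by for ... range jumps by the encoded width of each rune, so it cannot be used directly as a rune position.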
If you want to get a rune from it at an arbitrary index, you have to decode the bytes; there is no way around that (for ... range does this decoding for you). There is no "shortcut". The chosen representation just doesn't provide this out of the box.
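For illustration, the decoding that for ... range performs can also be written by hand with utf8.DecodeRuneInString. This is only a sketch, and runeAtDecode is a made-up name; it still walks the string from the start, so each call costs time proportional to the index.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAtDecode (hypothetical helper) returns the idx-th rune of s by
// decoding from the beginning. ok is false when idx is out of range.
func runeAtDecode(s string, idx int) (r rune, ok bool) {
	if idx < 0 {
		return 0, false
	}
	for len(s) > 0 {
		c, size := utf8.DecodeRuneInString(s)
		if idx == 0 {
			return c, true
		}
		idx--
		s = s[size:] // skip the bytes of the rune just decoded
	}
	return 0, false
}

func main() {
	r, ok := runeAtDecode("h\u00e9llo", 1)
	fmt.Printf("%c %v\n", r, ok)

	_, ok = runeAtDecode("h\u00e9llo", 10)
	fmt.Println(ok) // out of range
}
```

Returning a second ok value is one possible way to handle the out-of-range case; the question's version returns a zero rune instead.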
If you have to do this frequently / many times, you should change your input and use a []rune instead of a string, as it's a slice and can be indexed efficiently. string in Go is not []rune. string in Go is effectively a read-only []byte (UTF-8). Period.
If you can't change the input type, you may build an internal cache that maps each string to its []rune:
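    // cache maps each string to its decoded runes. Not safe for concurrent use.
    var cache = map[string][]rune{}

    // RuneAt returns the idx-th rune of s, or 0 if idx is out of range.
    func RuneAt(s string, idx int) rune {
        rs, ok := cache[s]
        if !ok {
            rs = []rune(s) // decode once
            cache[s] = rs  // store the slice we already built
        }
        if idx < 0 || idx >= len(rs) {
            return 0
        }
        return rs[idx]
    }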
Whether this is worth it depends on the case: if RuneAt() is called with a small set of strings, this may improve performance a lot. If the passed strings are more or less unique, this will result in worse performance and a lot of memory usage. Also, this implementation is not safe for concurrent use.
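If concurrent use matters, one possible variant guards the map with a sync.RWMutex. This is only a sketch: the runeCache type and newRuneCache constructor are made-up names, and the cache still grows without bound, so the memory caveat above applies unchanged.

```go
package main

import (
	"fmt"
	"sync"
)

// runeCache (hypothetical type) is a mutex-guarded string -> []rune cache.
type runeCache struct {
	mu sync.RWMutex
	m  map[string][]rune
}

func newRuneCache() *runeCache {
	return &runeCache{m: make(map[string][]rune)}
}

// RuneAt returns the idx-th rune of s, or 0 if idx is out of range.
func (c *runeCache) RuneAt(s string, idx int) rune {
	c.mu.RLock()
	rs, ok := c.m[s]
	c.mu.RUnlock()
	if !ok {
		// Decode outside the lock. Two goroutines may both decode the
		// same string; that is wasted work but not a data race.
		rs = []rune(s)
		c.mu.Lock()
		c.m[s] = rs
		c.mu.Unlock()
	}
	if idx < 0 || idx >= len(rs) {
		return 0
	}
	return rs[idx]
}

func main() {
	c := newRuneCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			fmt.Printf("%d: %c\n", i, c.RuneAt("h\u00e9llo", i))
		}(i)
	}
	wg.Wait()
}
```

A read-write mutex is chosen here because lookups are expected to dominate once the cache is warm; sync.Map would be another reasonable option.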