Access a random rune element of a string without using for ... range

Question

I recently asked this question, and the answers improved my understanding, but they didn't solve the actual problem I had. So I will try to ask a similar but different question, as follows.

Suppose that I want to access a random rune element of a string. One way is:

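// RuneElement returns the rune at rune index idx, decoding str from the start.
func RuneElement(str string, idx int) rune {
  n := 0 // rune counter; the index that range yields is a byte offset
  for _, c := range str {
    if n == idx {
      return c
    }
    n++
  }
  return 0 // out of range -> proper handling is needed
}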

What if I want to call such a function many times? I guess what I am looking for is an operator or function like str[i] (which returns a byte) that returns the rune element at the i-th position. Why can this element be accessed using for ... range, but not through a function like str.At(i), for example?

Recommended answer

string values in Go store the UTF-8 encoded byte sequence of the text. This is a design decision that has been made, and it won't change.
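
To make the byte-oriented representation concrete, here is a minimal sketch comparing the length of a string in bytes and in runes. The helper name byteAndRuneLen is made up for illustration; len and utf8.RuneCountInString are standard.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneLen (hypothetical helper) reports the length of s
// in bytes and in runes.
func byteAndRuneLen(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	s := "héllo" // 'é' takes 2 bytes in UTF-8
	b, r := byteAndRuneLen(s)
	fmt.Println(b, r) // 6 5
	fmt.Println(s[1]) // 195: the first byte of 'é', not a rune
}
```

Indexing with s[i] always addresses a byte, so s[1] here yields only part of the encoded 'é'.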

If you want to get a rune from it at an arbitrary index, you have to decode the bytes; there is nothing you can do about that (for ... range does this decoding too). Because UTF-8 is a variable-width encoding (1 to 4 bytes per rune), the byte offset of the i-th rune cannot be computed without scanning the bytes before it. There is no "shortcut". The chosen representation just doesn't provide this out of the box.
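
For illustration, the decoding can also be done by hand with utf8.DecodeRuneInString from the standard library. This is only a sketch: the name runeAtDecode and the (rune, bool) return shape are assumptions, and the cost is still linear in idx, just like the for ... range version.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAtDecode (hypothetical helper) decodes s rune by rune and returns
// the rune at rune index idx; the bool is false when idx is out of range.
func runeAtDecode(s string, idx int) (rune, bool) {
	if idx < 0 {
		return 0, false
	}
	for n := 0; len(s) > 0; n++ {
		// size is at least 1 for a non-empty string, so the loop advances.
		r, size := utf8.DecodeRuneInString(s)
		if n == idx {
			return r, true
		}
		s = s[size:]
	}
	return 0, false
}

func main() {
	r, ok := runeAtDecode("héllo", 1)
	fmt.Println(string(r), ok) // é true
}
```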

If you have to do this frequently, you should change your input and use a []rune instead of a string: it's a slice and can be indexed efficiently. A string in Go is not a []rune. A string in Go is effectively a read-only []byte (UTF-8). Period.
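
As a rough sketch of that approach, the conversion can be done once and wrapped in a small type that offers the At method the question asks about. The names RuneString, NewRuneString and At are invented for this example and are not part of any standard package.

```go
package main

import "fmt"

// RuneString (hypothetical type) holds the decoded runes of a string
// so that lookups by rune index take constant time.
type RuneString []rune

// NewRuneString decodes s once; this step is linear in len(s).
func NewRuneString(s string) RuneString {
	return RuneString(s)
}

// At returns the rune at index i; the bool is false when i is out of range.
func (rs RuneString) At(i int) (rune, bool) {
	if i < 0 || i >= len(rs) {
		return 0, false
	}
	return rs[i], true
}

func main() {
	rs := NewRuneString("héllo, 世界")
	r, ok := rs.At(7)
	fmt.Println(string(r), ok) // 世 true
	_, ok = rs.At(9)
	fmt.Println(ok) // false
}
```

The trade-off is memory: each rune occupies 4 bytes in the slice, regardless of how many bytes it takes in UTF-8.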

If you can't change the input type, you may build an internal cache that maps each string to its []rune:

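// cache maps a string to its decoded runes.
var cache = map[string][]rune{}

// RuneAt returns the rune at rune index idx, or 0 if idx is out of range.
func RuneAt(s string, idx int) rune {
    rs, ok := cache[s]
    if !ok {
        rs = []rune(s) // decode once and reuse the slice
        cache[s] = rs
    }
    if idx < 0 || idx >= len(rs) {
        return 0
    }
    return rs[idx]
}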

Whether this is worth it depends on the case: if RuneAt() is called with a small set of strings, this may improve performance a lot. If the passed strings are more or less unique, this will result in worse performance and a lot of memory usage. Also, this implementation is not safe for concurrent use.
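
If concurrent use is needed, one possible variant guards the map with a sync.RWMutex. This is a sketch under assumptions, not a tuned implementation: the type RuneCache and its constructor are made up here, and the cache still grows without bound.

```go
package main

import (
	"fmt"
	"sync"
)

// RuneCache (hypothetical type) is a string -> []rune cache
// that is safe for concurrent use.
type RuneCache struct {
	mu sync.RWMutex
	m  map[string][]rune
}

// NewRuneCache returns an empty cache.
func NewRuneCache() *RuneCache {
	return &RuneCache{m: make(map[string][]rune)}
}

// RuneAt returns the rune at rune index idx of s; the bool is false
// when idx is out of range.
func (c *RuneCache) RuneAt(s string, idx int) (rune, bool) {
	c.mu.RLock()
	rs, ok := c.m[s]
	c.mu.RUnlock()
	if !ok {
		// Two goroutines may both decode the same string here;
		// the result is identical, so the duplicate work is harmless.
		rs = []rune(s)
		c.mu.Lock()
		c.m[s] = rs
		c.mu.Unlock()
	}
	if idx < 0 || idx >= len(rs) {
		return 0, false
	}
	return rs[idx], true
}

func main() {
	c := NewRuneCache()
	r, ok := c.RuneAt("héllo", 1)
	fmt.Println(string(r), ok) // é true
}
```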
