Access random rune element of string without using for ... range


Problem description

I recently asked this question and the answers increased my understanding, but they didn't solve the actual problem I had. So, I will try to ask a similar but different question as follows.

Suppose that I want to access the rune at an arbitrary position in a string. One way is:

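func RuneElement(str string, idx int) rune {
  var ret rune
  // Note: i is the byte offset at which c starts, not its rune position,
  // so this matches rune positions only for ASCII-only strings.
  for i, c := range str {
    if i == idx {
      return c
    }
  }
  return ret // out of range -> proper handling is needed
}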

What if I want to call such a function many times? I guess what I am looking for is an operator or function like str[i] (which returns a byte) that returns the rune at the i-th position. Why can this element be accessed using for ... range but not through a function like str.At(i), for example?

Solution

Go's string type stores the UTF-8 encoded byte sequence of the text. This is a design decision that has been made and it won't change.

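If you want to get a rune at an arbitrary index, you have to decode the bytes, and there is nothing you can do about that (for ... range does this same decoding). Because UTF-8 is a variable-width encoding (1 to 4 bytes per rune), the byte offset of the i-th rune cannot be computed without scanning the bytes before it. There is no "shortcut". The chosen representation just doesn't provide this out of the box.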
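To make the decoding step concrete, here is a minimal sketch that walks the string with utf8.DecodeRuneInString from the standard unicode/utf8 package and counts runes until it reaches the requested position. The name runeAt and the (rune, bool) result are illustrative choices, not part of the original answer. Counting runes explicitly matters because the index that for ... range yields is a byte offset, not a rune position.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeAt returns the n-th rune (0-based) of s by decoding from the start.
// It is O(n) per call; the name and the bool result are illustrative.
func runeAt(s string, n int) (rune, bool) {
	if n < 0 {
		return 0, false
	}
	for len(s) > 0 {
		// Decode one rune and learn how many bytes it occupies.
		r, size := utf8.DecodeRuneInString(s)
		if n == 0 {
			return r, true
		}
		n--
		s = s[size:] // skip past the decoded rune
	}
	return 0, false // n is past the last rune
}

func main() {
	r, ok := runeAt("héllo", 2)
	fmt.Printf("%c %v\n", r, ok) // prints "l true"; byte offset 2 is inside 'é'
}
```

Every call pays for a scan from the start of the string, which is why repeated lookups call for a different representation.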

If you have to do this frequently / many times, you should change your input and not use string but a []rune, as it's a slice and can be efficiently indexed. string in Go is not []rune. string in Go is effectively a read-only []byte (UTF-8). Period.
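As a rough illustration of that advice, the sketch below converts a string to []rune once and then reads several positions by plain slice indexing. The helper pickRunes is hypothetical and exists only to show the pattern; out-of-range positions are simply skipped here, which is one of several reasonable choices.

```go
package main

import "fmt"

// pickRunes converts s to []rune once (one O(n) decoding pass plus an
// allocation) and then reads each requested position in O(1).
// Hypothetical helper: out-of-range indices are skipped.
func pickRunes(s string, idxs ...int) []rune {
	rs := []rune(s)
	out := make([]rune, 0, len(idxs))
	for _, i := range idxs {
		if i >= 0 && i < len(rs) {
			out = append(out, rs[i])
		}
	}
	return out
}

func main() {
	s := "héllo, 世界"
	fmt.Println(len(s), len([]rune(s)))       // 14 bytes, 9 runes
	fmt.Println(string(pickRunes(s, 1, 7, 8))) // é世界
}
```

The conversion costs time and memory proportional to the string length, so it pays off only when the same []rune is indexed more than a few times.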

If you can't change the input type, you may build an internal cache mapped from string to its []rune:

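var cache = map[string][]rune{}

// RuneAt returns the rune at rune index idx of s, or 0 if idx is out of range.
// Not safe for concurrent use: the cache map is read and written without locking.
func RuneAt(s string, idx int) rune {
    rs, ok := cache[s]
    if !ok {
        rs = []rune(s) // decode once
        cache[s] = rs  // store the same slice instead of converting twice
    }
    if idx < 0 || idx >= len(rs) {
        return 0
    }
    return rs[idx]
}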

Whether this is worth it depends on the case: if RuneAt() is called with a small set of strings, this may improve performance a lot. If the passed strings are more or less unique, this will result in worse performance and a lot of memory usage, because every distinct string is decoded and kept in the cache. Also, this implementation is not safe for concurrent use.
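If concurrent callers are a requirement, one possible variant guards the map with a sync.RWMutex. This is only a sketch under that assumption: the type runeCache, its constructor, and the (rune, bool) result are hypothetical names, and the cache still grows without bound, so the memory caveat above applies unchanged.

```go
package main

import (
	"fmt"
	"sync"
)

// runeCache is a hypothetical concurrency-safe variant of the cache idea.
type runeCache struct {
	mu sync.RWMutex
	m  map[string][]rune
}

func newRuneCache() *runeCache {
	return &runeCache{m: make(map[string][]rune)}
}

// RuneAt returns the idx-th rune of s and whether idx was in range.
func (c *runeCache) RuneAt(s string, idx int) (rune, bool) {
	c.mu.RLock()
	rs, ok := c.m[s]
	c.mu.RUnlock()
	if !ok {
		// Decode outside the lock. Two goroutines may both decode the same
		// string; the results are identical, so the duplicate store is harmless.
		rs = []rune(s)
		c.mu.Lock()
		c.m[s] = rs
		c.mu.Unlock()
	}
	if idx < 0 || idx >= len(rs) {
		return 0, false
	}
	return rs[idx], true
}

func main() {
	c := newRuneCache()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			r, ok := c.RuneAt("héllo", i)
			fmt.Printf("%d: %c %v\n", i, r, ok)
		}(i)
	}
	wg.Wait()
}
```

A read-write mutex keeps cache hits cheap when most lookups repeat; sync.Map from the standard library would be another option for read-heavy workloads.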
