Does accessing elements of string as byte perform conversion?

Question

In Go, to access elements of a string, we can write:

str := "text"
for i, c := range str {
  // str[i] is of type byte
  // c is of type rune
}

When accessing str[i], does Go perform a conversion from rune to byte? I would guess the answer is yes, but I am not sure. If so, which of the following methods is better performance-wise? Is one preferred over the other (in terms of best practice, for example)?

str := "large text"
for i := range str {
  // use str[i]
}

or

str := "large text"
str2 := []byte(str)
for _, s := range str2 {
  // use s
}

Answer

Which of the following methods is better performance-wise?

Definitely not this.

str := "large text"
str2 := []byte(str)
for _, s := range str2 {
  // use s
}

Strings are immutable and []byte is mutable, so []byte(str) has to make a copy. The code above therefore copies the entire string. In my experience, not knowing when strings get copied is a major source of performance problems with large strings.
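To illustrate the copy, here is a minimal sketch. The helper name copyAndMutate is made up for this example and is not part of the original answer. It converts a string to []byte, overwrites one byte, and shows that the original string is unaffected, which is only possible because the conversion copied the data.

```go
package main

import "fmt"

// copyAndMutate (hypothetical helper) converts s to a byte slice,
// overwrites the first byte, and returns the untouched string alongside
// the modified copy.
func copyAndMutate(s string) (original, modified string) {
	b := []byte(s) // the conversion copies the string's bytes into b
	if len(b) > 0 {
		b[0] = 'L' // changes only the copy, never s
	}
	return s, string(b)
}

func main() {
	orig, mod := copyAndMutate("large text")
	fmt.Println(orig) // large text
	fmt.Println(mod)  // Large text
}
```

For a string of a few bytes the copy is negligible. The cost grows with the length of the string, which is why it matters for large inputs.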

If str2 is never altered, the compiler may optimize away the copy. For this reason, it's better to write the loop as follows, so that the byte slice can never be altered.

str := "large text"
for _, s := range []byte(str) {
  // use s
}

That way there is no str2 that could be modified later and defeat the optimization.

But this is a bad idea because it will corrupt any multi-byte characters. See below.


As for the byte/rune conversion, performance is not a consideration because the two are not equivalent. Indexing does not convert anything: str[i] is simply the byte at offset i, while c is the rune decoded starting at that offset. If your string contains multi-byte characters, you have to use runes.
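A short sketch of the difference between bytes and runes, using the standard unicode/utf8 package. The helper name byteAndRuneCounts is an assumption made for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCounts (hypothetical helper) returns the length of s in bytes
// and the number of runes (Unicode code points) it contains.
func byteAndRuneCounts(s string) (nBytes, nRunes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	str := "snow ☃ man"

	nBytes, nRunes := byteAndRuneCounts(str)
	fmt.Println(nBytes, nRunes) // 12 10

	// Indexing yields a byte (an alias for uint8).
	fmt.Printf("%T\n", str[0]) // uint8

	// Decoding at byte offset 5 yields the whole snowman as a rune
	// (an alias for int32) together with its encoded width in bytes.
	r, size := utf8.DecodeRuneInString(str[5:])
	fmt.Printf("%T %c %d\n", r, r, size) // int32 ☃ 3
}
```

The snowman takes three bytes in UTF-8, so the string has 12 bytes but only 10 runes.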

For example...

package main

import "fmt"

func main() {
    str := "snow ☃ man"
    for i, c := range str {
        fmt.Printf("c:%c str[i]:%c\n", c, str[i])
    }
}

$ go run ~/tmp/test.go
c:s str[i]:s
c:n str[i]:n
c:o str[i]:o
c:w str[i]:w
c:  str[i]: 
c:☃ str[i]:â
c:  str[i]: 
c:m str[i]:m
c:a str[i]:a
c:n str[i]:n

Note that using str[i] corrupts the multi-byte Unicode snowman: str[i] holds only the first byte of the multi-byte character.

There's no performance difference anyway, because range str already has to do the work of going character by character rather than byte by byte.
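The character-by-character stepping can be observed through the index that range reports. In this minimal sketch (runeOffsets is a name invented for the example), the index is the byte offset at which each rune starts, so it jumps by the width of the rune just decoded.

```go
package main

import "fmt"

// runeOffsets (hypothetical helper) collects the byte offset at which each
// rune of s starts, as reported by the index variable of a range loop.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// The snowman starts at offset 5 and is 3 bytes wide,
	// so the next rune starts at offset 8.
	fmt.Println(runeOffsets("snow ☃ man")) // [0 1 2 3 4 5 8 9 10 11]
}
```

This also means that a loop of the form for i := range str that reads str[i] never visits offsets 6 and 7. It sees only the first byte of each character.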
