What are the possible consequences of using unsafe conversion from []byte to string in Go?


Question


The preferred way of converting []byte to string is this:

var b []byte
// fill b
s := string(b)

In this code the byte slice is copied, which can be a problem in situations where performance is important.
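To make the copy visible, here is a minimal sketch (the helper name copyThenMutate is hypothetical, chosen only for illustration). It converts a slice with the ordinary conversion and then changes the slice; the string is unaffected because it owns its own bytes.

```go
package main

import "fmt"

// copyThenMutate converts b with the ordinary conversion and then
// changes b. The returned string keeps its own copy of the bytes.
func copyThenMutate(b []byte) string {
	s := string(b) // allocates and copies len(b) bytes
	if len(b) > 0 {
		b[0] = 'X' // changes the slice only, not s
	}
	return s
}

func main() {
	b := []byte("hi")
	s := copyThenMutate(b)
	fmt.Println(s, string(b)) // prints "hi Xi"
}
```

The copy is the price paid for the immutability guarantee: nothing done to b afterwards can be observed through s.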

When performance is critical, one can consider performing the unsafe conversion:

var b []byte
// fill b
s := *(*string)(unsafe.Pointer(&b))

My question is: what can go wrong when using the unsafe conversion? I know that a string should be immutable, and that if we change b, s will also change. And still: so what? Is that all the bad that can happen?

Answer

Modifying something that the language spec guarantees to be immutable is an act of treason.

Since the spec guarantees that strings are immutable, compilers are allowed to generate code that caches their values and performs other optimizations based on this. You can't change the values of strings in any normal way, and if you resort to dirty ways (like package unsafe) to do it anyway, you lose all the guarantees provided by the spec. By continuing to use the modified strings, you may bump into "bugs" and unexpected behavior at random.

For example, if you use a string as a key in a map and you change the string after you put it into the map, you might not be able to find the associated value in the map using either the original or the modified value of the string (this is implementation dependent).

To demonstrate this, see this example:

m := map[string]int{}
b := []byte("hi")
s := *(*string)(unsafe.Pointer(&b))
m[s] = 999

fmt.Println("Before:", m)

b[0] = 'b'
fmt.Println("After:", m)

fmt.Println("But it's there:", m[s], m["bi"])

for i := 0; i < 1000; i++ {
    m[strconv.Itoa(i)] = i
}
fmt.Println("Now it's GONE:", m[s], m["bi"])
for k, v := range m {
    if k == "bi" {
        fmt.Println("But still there, just in a different bucket: ", k, v)
    }
}

Output (try it on the Go Playground):

Before: map[hi:999]
After: map[bi:<nil>]
But it's there: 999 999
Now it's GONE: 0 0
But still there, just in a different bucket:  bi 999

At first, we just see a weird result: a simple Println() is not able to find the value. It sees something (the key is found), but the value is displayed as <nil>, which is not even a valid value for the value type int (the zero value for int is 0).

If we grow the map (we add 1000 elements), the map's internal data structure gets restructured. After this, we cannot find our value even by explicitly asking for it with the appropriate key. It is still in the map, as we find it when iterating over all the key-value pairs, but since the hash code changes when the value of the string changes, the lookup most likely searches a different bucket than the one where the entry is stored (or where it should be).
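The bucket mismatch can be reproduced in safe code with a toy hash table. This is only a sketch under stated assumptions: toyMap, bucketOf, put, get and scan are hypothetical names, the hash is a deliberately trivial byte sum, and none of this mirrors the real runtime map. Keys are stored as []byte by reference, so they can be mutated after insertion, which is what the unsafe conversion makes possible for strings.

```go
package main

import "fmt"

// entry stores its key by reference, so the caller can still mutate it.
type entry struct {
	key   []byte
	value int
}

// toyMap is a fixed-size hash table with 8 buckets (illustration only).
type toyMap struct {
	buckets [8][]entry
}

// bucketOf is a trivial hash: the sum of the key bytes modulo 8.
func bucketOf(key []byte) int {
	sum := 0
	for _, c := range key {
		sum += int(c)
	}
	return sum % 8
}

// put files the entry under the bucket computed at insertion time.
func (m *toyMap) put(key []byte, value int) {
	i := bucketOf(key)
	m.buckets[i] = append(m.buckets[i], entry{key, value})
}

// get hashes the key and searches only the matching bucket.
func (m *toyMap) get(key []byte) (int, bool) {
	for _, e := range m.buckets[bucketOf(key)] {
		if string(e.key) == string(key) {
			return e.value, true
		}
	}
	return 0, false
}

// scan walks every bucket, the way iterating over a map does.
func (m *toyMap) scan(key []byte) (int, bool) {
	for _, bucket := range m.buckets {
		for _, e := range bucket {
			if string(e.key) == string(key) {
				return e.value, true
			}
		}
	}
	return 0, false
}

func main() {
	m := &toyMap{}
	key := []byte("hi")
	m.put(key, 999) // stored in the bucket for "hi"
	key[0] = 'b'    // the stored key now reads "bi"

	_, okOld := m.get([]byte("hi")) // right bucket, but the key no longer matches
	_, okNew := m.get([]byte("bi")) // key would match, but the wrong bucket is searched
	v, okScan := m.scan([]byte("bi"))
	fmt.Println(okOld, okNew, v, okScan) // prints "false false 999 true"
}
```

The entry is unreachable by either key through the hashed lookup, yet a full scan still finds it, which is the same pattern as the "GONE" and "still there" lines in the output above.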

Also note that code using package unsafe may work as you expect now, but the same code might behave completely differently (meaning it may break) with a future (or older) version of Go, as "packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines".
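As a side note on portability: since Go 1.20, package unsafe provides String and SliceData, which express the same zero-copy conversion without assuming that slice and string headers share a layout. The sketch below assumes Go 1.20 or newer, and bytesToString is a hypothetical helper name. It addresses only the layout assumption; the immutability problem described in this answer is unchanged, so the bytes must not be modified while the string is in use.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString returns a string that shares memory with b (no copy).
// Requires Go 1.20 or newer. The caller must not modify b while the
// returned string is still in use.
func bytesToString(b []byte) string {
	if len(b) == 0 {
		return ""
	}
	return unsafe.String(unsafe.SliceData(b), len(b))
}

func main() {
	b := []byte("hi")
	s := bytesToString(b)
	fmt.Println(s, len(s)) // prints "hi 2"
}
```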

Also, you may run into unexpected errors because the modified string might be used in different ways. Some code might just copy the string header, while other code may copy its content. See this example:

b := []byte{'h', 'i'}
s := *(*string)(unsafe.Pointer(&b))

s2 := s                 // Copy string header
s3 := string([]byte(s)) // New string header but same content
fmt.Println(s, s2, s3)
b[0] = 'b'

fmt.Println(s == s2)
fmt.Println(s == s3)

We created two new local variables, s2 and s3, from s: s2 is initialized by copying the string header of s, and s3 is initialized with a new string value (a new string header) that has the same content. Now if you modify the original s, you would expect in a correct program that comparing the new strings to the original gives the same result for both, be it true or false (depending on whether values were cached, but it should be the same).

But the output is (try it on the Go Playground):

hi hi hi
true
false
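The difference between copying a header and copying content can also be shown without package unsafe, using slices, whose headers likewise hold a pointer to the data. This is a small sketch; aliasAndClone is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
)

// aliasAndClone returns a header copy of b, which shares b's backing
// array, and a content copy, which has a backing array of its own.
func aliasAndClone(b []byte) (alias, clone []byte) {
	alias = b                         // copies only the header
	clone = append([]byte(nil), b...) // copies the bytes
	return alias, clone
}

func main() {
	b := []byte("hi")
	alias, clone := aliasAndClone(b)
	b[0] = 'b'

	fmt.Println(bytes.Equal(b, alias)) // prints "true": same memory
	fmt.Println(bytes.Equal(b, clone)) // prints "false": independent copy
}
```

For slices this behavior is expected and documented. For strings it is exactly what the language promises can never be observed, which is why s2 and s3 disagree in the example above.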
