Is concurrent write on stdout threadsafe?
Problem description
The code below does not produce a data race:
    package main

    import (
        "fmt"
        "os"
        "strings"
    )

    func main() {
        x := strings.Repeat(" ", 1024)
        go func() {
            for {
                fmt.Fprintf(os.Stdout, x+"aa\n")
            }
        }()
        go func() {
            for {
                fmt.Fprintf(os.Stdout, x+"bb\n")
            }
        }()
        go func() {
            for {
                fmt.Fprintf(os.Stdout, x+"cc\n")
            }
        }()
        go func() {
            for {
                fmt.Fprintf(os.Stdout, x+"dd\n")
            }
        }()
        <-make(chan bool)
    }
I tried several data lengths, with this variant: https://play.golang.org/p/29Cnwqj5K30
This post says it is not thread-safe.
This mail does not really answer the question, or I did not understand it.
The package documentation of os and fmt doesn't say much about this. I admit I did not dig into the source code of those two packages to find further explanations; they appear too complex to me.
What are the recommendations and their references?
I'm not sure it would qualify as a definitive answer but I'll try to provide some insight.
The F* functions of the fmt package merely state that they take a value of a type implementing the io.Writer interface and call Write on it.
The functions themselves are safe for concurrent use, in the sense that it's OK to call any number of fmt.Fwhatever concurrently: the package itself is prepared for that. But supporting an interface in Go says nothing about the concurrency properties of the concrete type behind it.
In other words, whether concurrent use is allowed is deferred to the "writer" which the functions of fmt write to.
(One should also keep in mind that the fmt.*Print* functions are allowed to call Write on their destination any number of times, as opposed to those provided by the stock package log.)
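To see how many Write calls a given fmt call makes on your Go version, a small counting wrapper is enough. This is only a sketch: countingWriter and countWrites are hypothetical names introduced here, and whatever number it prints is an implementation detail of fmt, not something the package promises.

```go
package main

import (
	"fmt"
	"io"
)

// countingWriter is a hypothetical helper that records how many times
// Write is called before forwarding to the wrapped writer.
type countingWriter struct {
	w     io.Writer
	calls int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return c.w.Write(p)
}

// countWrites reports how many Write calls a single fmt.Fprintf call
// makes on its destination. It is meant for single-goroutine use only.
func countWrites(format string, args ...interface{}) int {
	cw := &countingWriter{w: io.Discard}
	fmt.Fprintf(cw, format, args...)
	return cw.calls
}

func main() {
	fmt.Println("Write calls:", countWrites("%s %d\n", "aa", 42))
}
```

Since the count is not part of the documented contract, code that needs contiguous output should not rely on it.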
So, we basically have two cases:
- Custom implementations of io.Writer.
- Stock implementations of it, such as *os.File or wrappers around sockets produced by the functions of the net package.
The first case is the simple one: whatever the implementor did.
The second case is harder: as I understand it, the Go standard library's stance on this (albeit not clearly stated in the docs) is that the wrappers it provides around "things" provided by the OS, such as file descriptors and sockets, are reasonably "thin", and hence whatever semantics those OS objects implement are transitively exposed by the stdlib code running on a particular system.
For instance, POSIX requires that write(2) calls are atomic with regard to one another when they operate on regular files or symbolic links. Since any call to Write on things wrapping file descriptors or sockets actually results in a single "write" syscall on the target system, you can consult the docs of the target OS to get an idea of what will happen.
Note that POSIX only speaks about filesystem objects. If os.Stdout is opened to a terminal (or a pseudo-terminal), to a pipe, or to anything else which supports the write(2) syscall, the results will depend on what the relevant subsystem and/or driver implements: for instance, data from multiple concurrent calls may be interspersed, or one of the calls, or both, may simply be failed by the OS (unlikely, but still).
Going back to Go, from what I gather, the following facts hold true about the Go stdlib types which wrap file descriptors and sockets:
- They are safe for concurrent use by themselves (I mean, on the Go level).
- They "map" Write and Read calls 1-to-1 to the underlying object: that is, a Write call is never split into two or more underlying syscalls, and a Read call never returns data "glued" from the results of multiple underlying syscalls. (By the way, people occasionally get tripped up by this no-frills behaviour; for example, see this or this.)
So basically, when we combine this with the fact that the fmt.*Print* functions are free to call Write any number of times per single call, the outcome for your examples, which use os.Stdout, is:
- There will never be a data race (unless you've assigned the variable os.Stdout some custom implementation), but
- The data actually written to the underlying FD will be intermixed in an unpredictable order, which may depend on many factors including the OS kernel version and settings, the version of Go used to build the program, the hardware and the load on the system.
TL;DR
- Multiple concurrent calls to fmt.Fprint* writing to the same "writer" value defer their concurrency to the implementation (type) of the "writer".
- It's impossible to have a data race with "file-like" objects provided by the Go stdlib in the setup you have presented in your question.
- The real problem is not data races at the Go program level but concurrent access to a single resource at the OS level. There, we do not (usually) speak about data races, because the commodity OSes Go supports expose things one may "write to" as abstractions, where a real data race would possibly indicate a bug in the kernel or in the driver (and Go's race detector would not be able to detect it anyway, as that memory is not owned by the Go runtime powering the process).
Basically, in your case, if you need to be sure the data produced by any particular call to fmt.Fprint* comes out as a single contiguous piece to the actual data receiver provided by the OS, you need to serialize these calls, as the fmt package provides no guarantees regarding the number of calls to Write on the supplied "writer" for the functions it exports.
The serialization may either be external (explicit, that is "take a lock, call fmt.Fprint*, release the lock") or internal, by wrapping os.Stdout in a custom type which manages a lock and using that type instead.
And while we're at it, the log package does just that, and can be used straight away, since the "loggers" it provides, including the default one, allow suppressing the output of "log headers" (such as the timestamp and the name of the file).