How to know what is NOT thread-safe in Ruby?


Question

Starting from Rails 4, everything has to run in a threaded environment by default. This means that all of the code we write AND ALL the gems we use are required to be thread-safe.

So, I have a few questions on this:

  1. What is NOT thread-safe in Ruby/Rails, and what is thread-safe in Ruby/Rails?
  2. Is there a list of gems that are known to be thread-safe, or vice versa?
  3. Is there a list of common code patterns that are NOT thread-safe, for example @result ||= some_method?
  4. Are the data structures in the Ruby core, such as Hash, thread-safe?
  5. On MRI, where there is a GVL/GIL, which means only one Ruby thread can run at a time except for IO, does the thread-safe change affect us?

Recommended answer

None of the core data structures are thread safe. The only thread-safe one I know of that ships with Ruby is the queue implementation in the standard library (require 'thread'; q = Queue.new).
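As a rough illustration of the queue mentioned above, the following sketch hands work to a few threads through one Queue and collects results through another. The names jobs, results and the :done stop marker are made up for this example.

```ruby
require 'thread' # only needed on older Rubies; Queue is built in on recent ones

# Queue does its own locking, so push and pop need no extra Mutex.
jobs = Queue.new
results = Queue.new

workers = 3.times.map do
  Thread.new do
    # pop blocks until an item is available; :done is our stop marker
    while (job = jobs.pop) != :done
      results << job * 2
    end
  end
end

10.times { |i| jobs << i }
3.times { jobs << :done } # one stop marker per worker
workers.each(&:join)

# All workers have finished, so draining the queue here is single-threaded.
doubled = []
doubled << results.pop until results.empty?
puts doubled.sort.inspect
```

The order in which results arrive depends on scheduling, which is why the sketch sorts before printing.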

MRI's GIL does not save us from thread safety issues. It only makes sure that two threads cannot run Ruby code at the same time, i.e. on two different CPUs at the exact same time. Threads can still be paused and resumed at any point in your code. If you write code like @n = 0; 3.times { Thread.start { 100.times { @n += 1 } } }, i.e. code that mutates a shared variable from multiple threads, the value of the shared variable afterwards is not deterministic. The GIL is more or less a simulation of a single-core system; it does not change the fundamental issues of writing correct concurrent programs.
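One common way to make the counter example deterministic is to wrap the read-modify-write in a Mutex. This is a minimal sketch, not the only option; the variable names counter and lock are chosen for illustration.

```ruby
counter = 0
lock = Mutex.new

threads = 3.times.map do
  Thread.new do
    100.times do
      # The read, the addition and the write happen as one unit:
      # no other thread can enter this block until it finishes.
      lock.synchronize { counter += 1 }
    end
  end
end

# Wait for every thread before reading the result.
threads.each(&:join)
puts counter
```

With the lock in place every increment is applied, so the final value is 3 * 100 regardless of how the threads are scheduled.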

Even if MRI had been single-threaded like Node.js, you would still have to think about concurrency. The example with the incremented variable would work fine, but you can still get race conditions where things happen in non-deterministic order and one callback clobbers the result of another. Single-threaded asynchronous systems are easier to reason about, but they are not free from concurrency issues. Just think of an application with multiple users: if two users hit edit on a Stack Overflow post at more or less the same time, spend some time editing the post and then hit save, whose changes will be seen by a third user later when they read that same post?

In Ruby, as in most other concurrent runtimes, anything that is more than one operation is not thread safe. @n += 1 is not thread safe, because it is multiple operations (a read, an addition and a write). @n = 1 is thread safe because it is one operation (it's lots of operations under the hood, and I would probably get into trouble if I tried to describe why it's "thread safe" in detail, but in the end you will not get inconsistent results from assignments). @n ||= 1 is not, and neither is any other shorthand operation + assignment. One mistake I've made many times is writing return unless @started; @started = true, which is not thread safe at all.
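A check followed by a set, like the @started guard above, can be made atomic by doing both steps under the same lock. The sketch below uses a hypothetical Service class and writes the guard as return if @started, so that only the first caller gets past it; start_count exists only to make the effect observable.

```ruby
class Service
  attr_reader :start_count

  def initialize
    @started = false
    @start_count = 0
    @lock = Mutex.new
  end

  def start
    @lock.synchronize do
      # The check and the assignment cannot be interleaved with another
      # thread's check, because both happen while holding the lock.
      return if @started
      @started = true
      @start_count += 1 # stands in for the real start-up work
    end
  end
end

service = Service.new
10.times.map { Thread.new { service.start } }.each(&:join)
puts service.start_count
```

Returning from inside synchronize is fine: the mutex is released when the block is left, including on an early return.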

I don't know of any authoritative list of thread safe and non-thread safe statements for Ruby, but there is a simple rule of thumb: if an expression only does one (side-effect free) operation, it is probably thread safe. For example: a + b is ok, a = b is also ok, and a.foo(b) is ok if the method foo is side-effect free (since just about anything in Ruby is a method call, even assignment in many cases, this goes for the other examples too). Side effects in this context means things that change state. def foo(x); @x = x; end is not side-effect free.

One of the hardest things about writing thread safe code in Ruby is that all core data structures, including array, hash and string, are mutable. It's very easy to accidentally leak a piece of your state, and when that piece is mutable, things can get really screwed up. Consider the following code:

class Thing
  attr_reader :stuff

  def initialize(initial_stuff)
    @stuff = initial_stuff
    @state_lock = Mutex.new
  end

  def add(item)
    @state_lock.synchronize do
      @stuff << item
    end
  end
end

An instance of this class can be shared between threads and they can safely add things to it, but there's a concurrency bug (it's not the only one): the internal state of the object leaks through the stuff accessor. Besides being problematic from the encapsulation perspective, it also opens up a can of concurrency worms. Maybe someone takes that array and passes it on to somewhere else, and that code in turn thinks it now owns that array and can do whatever it wants with it.
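One possible way to plug the leak, sketched here under the assumption that callers only need to read the contents, is to copy the array on the way in and hand out frozen snapshots on the way out. SafeThing is a hypothetical variant of the class above, not part of the original answer.

```ruby
class SafeThing
  def initialize(initial_stuff)
    @stuff = initial_stuff.dup # do not share the caller's array
    @state_lock = Mutex.new
  end

  def add(item)
    @state_lock.synchronize { @stuff << item }
  end

  # Return a frozen copy instead of the internal array, so callers
  # can neither mutate our state nor observe it mid-update.
  def stuff
    @state_lock.synchronize { @stuff.dup.freeze }
  end
end

original = [1, 2]
thing = SafeThing.new(original)
thing.add(3)
original << 99 # no longer affects thing

snapshot = thing.stuff
puts snapshot.inspect
puts snapshot.frozen?
```

Note that dup and freeze are shallow: the elements themselves are still shared, so mutable elements (strings, nested arrays) would need the same care.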

Another classic Ruby example is this:

STANDARD_OPTIONS = {:color => 'red', :count => 10}

def find_stuff
  @some_service.load_things('stuff', STANDARD_OPTIONS)
end

find_stuff works fine the first time it's used, but returns something else the second time. Why? The load_things method happens to think it owns the options hash passed to it, and does color = options.delete(:color). Now the STANDARD_OPTIONS constant doesn't have the same value anymore. Constants are only constant in what they reference; they do not guarantee the constancy of the data structures they refer to. Just think what would happen if this code was run concurrently.
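Two defensive habits help with this kind of bug: freeze shared constants, and have methods copy option hashes before modifying them. The sketch below uses a stand-in load_things written for this example (the real method belongs to @some_service and is not shown in the answer).

```ruby
STANDARD_OPTIONS = { :color => 'red', :count => 10 }.freeze

# Hypothetical stand-in for the service method discussed above.
def load_things(name, options)
  options = options.dup # private, unfrozen copy; the caller's hash is untouched
  color = options.delete(:color)
  "#{name}: #{color}, #{options[:count]}"
end

first = load_things('stuff', STANDARD_OPTIONS)
second = load_things('stuff', STANDARD_OPTIONS)
puts first
puts second
```

Because the constant is frozen, a load_things that skipped the dup would raise an error on delete instead of silently changing the defaults for every other caller. As with any freeze, this is shallow: the 'red' string inside the hash is still mutable unless it is frozen too.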

If you avoid shared mutable state (e.g. instance variables in objects accessed by multiple threads, data structures like hashes and arrays accessed by multiple threads), thread safety isn't so hard. Try to minimize the parts of your application that are accessed concurrently, and focus your efforts there. IIRC, in a Rails application, a new controller object is created for every request, so it is only going to get used by a single thread, and the same goes for any model objects you create from that controller. However, Rails also encourages the use of global variables: User.find(...) uses the global variable User. You may think of it as only a class, and it is a class, but it is also a namespace for global variables. Some of these are safe because they are read only, but sometimes you save things in these global variables because it is convenient. Be very careful when you use anything that is globally accessible.
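When state really does have to live at class level, one option is to guard every access with a lock. The following is a minimal sketch with a hypothetical Settings class; it is not how Rails itself stores anything.

```ruby
class Settings
  # Class-level instance variables: effectively global, shared by all threads.
  @cache = {}
  @lock = Mutex.new

  class << self
    # Returns the cached value for key, computing it with the block
    # at most once even when several threads ask at the same time.
    def fetch(key)
      @lock.synchronize do
        @cache[key] ||= yield
      end
    end
  end
end

calls = 0
threads = 5.times.map do
  Thread.new { Settings.fetch(:answer) { calls += 1; 42 } }
end
values = threads.map(&:value) # value joins the thread and returns its result

puts values.inspect
puts calls
```

The trade-off is that the block runs while the lock is held, so a slow computation blocks every other caller of fetch; that is acceptable for a sketch but worth revisiting in real code.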

It's been possible to run Rails in threaded environments for quite a while now, so without being a Rails expert I would still go so far as to say that you don't have to worry about thread safety when it comes to Rails itself. You can still create Rails applications that aren't thread safe by doing some of the things I mention above. When it comes to other gems, assume that they are not thread safe unless they say that they are, and if they say that they are, assume that they are not and look through their code. Just because you see that they do things like @n ||= 1 does not mean that they are not thread safe; that's a perfectly legitimate thing to do in the right context. You should instead look for things like mutable state in global variables, how the gem handles mutable objects passed to its methods, and especially how it handles options hashes.

Finally, being thread unsafe is a transitive property. Anything that uses something that is not thread safe is itself not thread safe.
