Concurrent requests with MRI Ruby


Question

I put together a basic example to try to demonstrate concurrent requests in Rails. Note that I am using MRI Ruby 2 and Rails 4.2.

  def api_call
    sleep(10)               # simulate a slow, blocking request
    render :json => "done"
  end

I then open 4 different tabs in Chrome on my Mac (i7 / 4 cores) and watch whether they run in series or in parallel (really, concurrently, which is close but not the same thing). That is, each tab requests http://localhost:3000/api_call.

I cannot get this to work using Puma, Thin, or Unicorn. The requests each come back in series: the first tab after 10 seconds, the second after 20 (since it had to wait for the first to complete), the third after that, and so on.
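
As a sanity check, here is a sketch of my own (not from the original question) that drives the same four requests from Ruby instead of browser tabs; the tab query parameter is purely illustrative, and it also rules out the browser serializing identical GETs to the same URL:

    require 'net/http'

    threads = 4.times.map do |i|
      Thread.new do
        started = Time.now
        # distinct query strings so no client-side coalescing or caching applies
        res = Net::HTTP.get_response(URI("http://localhost:3000/api_call?tab=#{i}"))
        puts "tab #{i}: HTTP #{res.code} after #{(Time.now - started).round(1)}s"
      end
    end
    threads.each(&:join)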

From what I have read, I believe the following to be true (please correct me if not), and this is what I observed:

• Unicorn is multi-process, so my example should have worked (after defining the number of workers in a unicorn.rb config file), but it does not. I can see 4 workers starting, yet everything runs in series. I am using the unicorn-rails gem, starting Rails with unicorn -c config/unicorn.rb, and in my unicorn.rb I have:

-- unicorn.rb

worker_processes 4   # one worker per CPU core
preload_app true
timeout 30
listen 3000
after_fork do |server, worker|
  # re-establish the ActiveRecord connection in each forked worker
  ActiveRecord::Base.establish_connection
end

• Thin and Puma are multithreaded (although Puma at least has a 'clustered' mode where you can start workers with a -w parameter) and should not work anyway (in multithreaded mode) with MRI Ruby 2.0, because "there is a Global Interpreter Lock (GIL) that ensures only one thread can be run at a time".
• So:
  • Do I have a valid example (or is using sleep the wrong approach)?
  • Are my statements above about multi-process and multithreading (with respect to MRI Ruby 2) correct?
  • Any ideas on why I cannot get this working with Unicorn (or any server, for that matter)?

There is a very similar question to mine, but I can't get it working as answered, and it doesn't answer all of my questions about concurrent requests using MRI Ruby.

GitHub project: https://github.com/afrankel/limitedBandwidth (note: the project looks at more than just this question of multi-process/threading on the server)

Answer

I invite you to read Jesse Storimer's series Nobody understands the GIL. It might help you better understand some MRI internals.

I have also found Pragmatic Concurrency with Ruby, which is an interesting read. It has some examples of testing concurrency.

In addition, I can recommend the article Removing config.threadsafe!. It might not be relevant for Rails 4, but it explains the configuration options, one of which you can use to allow concurrency.

Let's discuss the answer to your question.

You can have several threads (using MRI), even with Puma. The GIL ensures that only one thread is active at a time; that is the constraint developers call restrictive (because there is no real parallel execution). Bear in mind that the GIL does not guarantee thread safety. It also does not mean that the other threads are not running: they are waiting for their turn, and they can interleave (the articles above can help you understand this better).
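
To make the interleaving concrete, here is a minimal sketch of my own (not from the original answer): sleep is a blocking call during which MRI releases the GIL, so four threads sleeping 2 seconds each finish in about 2 seconds total, not 8.

    require 'benchmark'

    elapsed = Benchmark.realtime do
      threads = 4.times.map do |i|
        Thread.new do
          sleep(2)                # MRI releases the GIL while a thread sleeps
          puts "thread #{i} done"
        end
      end
      threads.each(&:join)        # wait for all threads to finish
    end
    puts format('total: %.1fs', elapsed)   # ~2s on MRI, not ~8s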

Let me clear up some terms: worker process and thread. A process runs in a separate memory space and can serve several threads. Threads of the same process run in a shared memory space, namely that of their process. By threads we mean Ruby threads in this context, not CPU threads.
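
To illustrate the difference, here is a small sketch of my own (fork requires MRI on a Unix-like system; it is not available on JRuby):

    counter = 0

    Thread.new { counter += 1 }.join
    puts counter     # => 1: the thread mutated its process's shared memory

    pid = fork { counter += 1 }   # the child only mutates its own copy
    Process.wait(pid)
    puts counter     # => still 1: the parent's memory is unchanged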

Regarding your question's configuration and the GitHub repo you shared, I think an appropriate configuration (I used Puma) is to set up 4 workers and 1 to 40 threads. The idea is that one worker serves one tab, and each tab sends up to 10 requests.

Let's get started:

I work on Ubuntu in a virtual machine, so I first enabled 4 cores in my virtual machine's settings (plus some other settings that I thought might help). I could verify this on my machine (see below), so I went with that.

      Linux command --> lscpu
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                4
      On-line CPU(s) list:   0-3
      Thread(s) per core:    1
      Core(s) per socket:    4
      Socket(s):             1
      NUMA node(s):          1
      Vendor ID:             GenuineIntel
      CPU family:            6
      Model:                 69
      Stepping:              1
      CPU MHz:               2306.141
      BogoMIPS:              4612.28
      L1d cache:             32K
      L1i cache:             32K
      L2 cache:              6144K
      NUMA node0 CPU(s):     0-3
      

I used the GitHub project you shared and modified it slightly: I created a Puma configuration file named puma.rb (put it in the config directory) with the following content:

      workers Integer(ENV['WEB_CONCURRENCY'] || 1)
      threads_count = Integer(ENV['MAX_THREADS'] || 1)
      threads 1, threads_count
      
      preload_app!
      
      rackup      DefaultRackup
      port        ENV['PORT']     || 3000
      environment ENV['RACK_ENV'] || 'development'
      
      on_worker_boot do
        # Worker specific setup for Rails 4.1+
        # See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
        #ActiveRecord::Base.establish_connection
      end
      

By default, Puma starts with 1 worker and 1 thread. You can use environment variables to modify those parameters, which is what I did:

      export MAX_THREADS=40
      export WEB_CONCURRENCY=4
      

To start Puma with this configuration, I typed

      bundle exec puma -C config/puma.rb
      

in the Rails app directory.
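
As an aside (my addition; these are standard Puma command-line flags), the same worker/thread settings can be passed directly on the command line instead of via environment variables:

    bundle exec puma -w 4 -t 1:40 -p 3000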

I opened four browser tabs to call the app's URL.

The first request started around 15:45:05 and the last one finished around 15:49:44, an elapsed time of 4 minutes and 39 seconds. You can also see the request IDs arrive in non-sorted order in the log file (see below).

Each API call in the GitHub project sleeps for 15 seconds. We have 4 tabs, each making 10 API calls, which gives a maximum elapsed time of 4 × 10 × 15 s = 600 seconds, i.e. 10 minutes (in strictly serial mode).

The theoretically ideal result would be everything running in parallel, with an elapsed time not far from 15 seconds, but I did not expect that at all. I was not sure exactly what to expect, yet I was still positively surprised (considering that I ran on a virtual machine and MRI is restrained by the GIL, among other factors): the elapsed time of this test was less than half the maximum (strictly serial) elapsed time.

EDIT: I read further about Rack::Lock, which wraps a mutex around each request (third article above). I found the option config.allow_concurrency = true to be a time saver. A small caveat was having to increase the connection pool accordingly (even though these requests do not query the database, it had to be set); the maximum thread count is a good default: 40 in this case.
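
Concretely, the two settings mentioned above would look something like this; this is my sketch, not the repo's exact files, and the pool value simply mirrors MAX_THREADS:

    # config/application.rb (or an environment file):
    config.allow_concurrency = true   # disables the Rack::Lock per-request mutex

    # config/database.yml (YAML, shown here as a comment):
    # development:
    #   pool: 40    # match the maximum thread count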

I tested the app with JRuby, and the actual elapsed time was 2 minutes, with allow_concurrency=true.

I tested the app with MRI, and the actual elapsed time was 1 min 47 s, with allow_concurrency=true. This really surprised me, because I expected MRI to be slower than JRuby. It was not, and it makes me question the widespread discussion about the speed differences between MRI and JRuby.

Watching the responses on the different tabs is "more random" now: it happens that tab 3 or tab 4 completes before tab 1, which I requested first.

I think that because you don't have race conditions, the test seems to be OK. However, I am not sure about the application-wide consequences of setting config.allow_concurrency=true in a real-world application.

Feel free to check it out and let me know any feedback you might have. I still have the clone on my machine; let me know if you are interested.

To answer your questions in order:

• I think your example is valid. For concurrency, though, it is better to test with a shared resource (as in the second article; see also the sketch after this list).
• Regarding your statements: as mentioned at the beginning of this answer, MRI is multithreaded but restricted by the GIL to one active thread at a time. This raises the question: with MRI, wouldn't it be better to test with more processes and fewer threads? I don't really know; a first guess is that it makes little or no difference, but maybe someone can shed light on this.
• I think your example is fine. It just needs some slight modifications.
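
A minimal sketch of my own of what such a shared-resource test could look like: the GIL does not make the read-modify-write on a shared counter atomic, so without the mutex the final count is not guaranteed.

    counter = 0
    mutex   = Mutex.new

    threads = 10.times.map do
      Thread.new do
        1_000.times do
          mutex.synchronize { counter += 1 }  # drop the mutex to probe for lost updates
        end
      end
    end
    threads.each(&:join)
    puts counter  # => 10000 with the mutex; not guaranteed without it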

Log file of the Rails app:

      **config.allow_concurrency = false (by default)**
      -> Ideally 1 worker per core, each worker serves up to 10 threads.
      
      [3045] Puma starting in cluster mode...
      [3045] * Version 2.11.2 (ruby 2.1.5-p273), codename: Intrepid Squirrel
      [3045] * Min threads: 1, max threads: 40
      [3045] * Environment: development
      [3045] * Process workers: 4
      [3045] * Preloading application
      [3045] * Listening on tcp://0.0.0.0:3000
      [3045] Use Ctrl-C to stop
      [3045] - Worker 0 (pid: 3075) booted, phase: 0
      [3045] - Worker 1 (pid: 3080) booted, phase: 0
      [3045] - Worker 2 (pid: 3087) booted, phase: 0
      [3045] - Worker 3 (pid: 3098) booted, phase: 0
      Started GET "/assets/angular-ui-router/release/angular-ui-router.js?body=1" for 127.0.0.1 at 2015-05-11 15:45:05 +0800
      ...
      ...
      ...
      Processing by ApplicationController#api_call as JSON
        Parameters: {"t"=>"15?id=9"}
      Completed 200 OK in 15002ms (Views: 0.2ms | ActiveRecord: 0.0ms)
      [3075] 127.0.0.1 - - [11/May/2015:15:49:44 +0800] "GET /api_call.json?t=15?id=9 HTTP/1.1" 304 - 60.0230
      


      **config.allow_concurrency = true**
      -> Ideally 1 worker per core, each worker serves up to 10 threads.
      
      [22802] Puma starting in cluster mode...
      [22802] * Version 2.11.2 (ruby 2.2.0-p0), codename: Intrepid Squirrel
      [22802] * Min threads: 1, max threads: 40
      [22802] * Environment: development
      [22802] * Process workers: 4
      [22802] * Preloading application
      [22802] * Listening on tcp://0.0.0.0:3000
      [22802] Use Ctrl-C to stop
      [22802] - Worker 0 (pid: 22832) booted, phase: 0
      [22802] - Worker 1 (pid: 22835) booted, phase: 0
      [22802] - Worker 3 (pid: 22852) booted, phase: 0
      [22802] - Worker 2 (pid: 22843) booted, phase: 0
      Started GET "/" for 127.0.0.1 at 2015-05-13 17:58:20 +0800
      Processing by ApplicationController#index as HTML
        Rendered application/index.html.erb within layouts/application (3.6ms)
      Completed 200 OK in 216ms (Views: 200.0ms | ActiveRecord: 0.0ms)
      [22832] 127.0.0.1 - - [13/May/2015:17:58:20 +0800] "GET / HTTP/1.1" 200 - 0.8190
      ...
      ...
      ...
      Completed 200 OK in 15003ms (Views: 0.1ms | ActiveRecord: 0.0ms)
      [22852] 127.0.0.1 - - [13/May/2015:18:00:07 +0800] "GET /api_call.json?t=15?id=10 HTTP/1.1" 304 - 15.0103
      


      **config.allow_concurrency = true (by default)**
      -> Ideally each thread serves a request.
      
      Puma starting in single mode...
      * Version 2.11.2 (jruby 2.2.2), codename: Intrepid Squirrel
      * Min threads: 1, max threads: 40
      * Environment: development
      NOTE: ActiveRecord 4.2 is not (yet) fully supported by AR-JDBC, please help us finish 4.2 support - check http://bit.ly/jruby-42 for starters
      * Listening on tcp://0.0.0.0:3000
      Use Ctrl-C to stop
      Started GET "/" for 127.0.0.1 at 2015-05-13 18:23:04 +0800
      Processing by ApplicationController#index as HTML
        Rendered application/index.html.erb within layouts/application (35.0ms)
      ...
      ...
      ...
      Completed 200 OK in 15020ms (Views: 0.7ms | ActiveRecord: 0.0ms)
      127.0.0.1 - - [13/May/2015:18:25:19 +0800] "GET /api_call.json?t=15?id=9 HTTP/1.1" 304 - 15.0640
      
