Concurrent requests with MRI Ruby


Problem description

I put together a basic example to try to prove that concurrent requests work in Rails. Note that I am using MRI Ruby 2 and Rails 4.2.

  def api_call
    sleep(10)
    render :json => "done"
  end
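
For completeness, the action needs a route; the question doesn't show one, so the line below is an assumed minimal setup (the controller name matches the ApplicationController#api_call seen in the logs later):

  # config/routes.rb -- assumed minimal route for the example action
  get 'api_call' => 'application#api_call'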

I then open 4 different tabs in Chrome on my Mac (i7, 4 cores) and watch whether they run in series or in parallel (really concurrent, which is close but not the same thing), i.e., http://localhost:3000/api_call
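
Browser tabs are a coarse probe (Chrome can queue identical in-flight requests to the same URL), so a small script is a more reliable way to fire the requests at once. A minimal sketch, not part of the original question:

  # concurrent_requests.rb -- fire 4 requests at once and time each one
  require 'net/http'

  threads = 4.times.map do |i|
    Thread.new do
      started = Time.now
      Net::HTTP.get(URI('http://localhost:3000/api_call'))
      puts "request #{i} finished after #{(Time.now - started).round(1)}s"
    end
  end
  threads.each(&:join)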

I cannot get this to work using Puma, Thin, or Unicorn. The requests each come back in series: the first tab after 10 seconds, the second after 20 (since it had to wait for the first to complete), the third after that, and so on.

From what I have read, I believe the following to be true (please correct me), and it matches my results:

• Unicorn is multi-process, so my example should have worked (after defining the number of workers in a unicorn.rb config file), but it does not. I can see 4 workers starting, yet everything runs in series. I am using the unicorn-rails gem, starting Rails with unicorn -c config/unicorn.rb, and in my unicorn.rb I have:

-- unicorn.rb

worker_processes 4
preload_app true
timeout 30
listen 3000
after_fork do |server, worker|
  ActiveRecord::Base.establish_connection
end

• Thin and Puma are multithreaded (although Puma at least has a 'clustered' mode where you can start workers with a -w parameter) and should not work anyway (in multithreaded mode) with MRI Ruby 2.0, because "there is a Global Interpreter Lock (GIL) that ensures only one thread can be run at a time" (see the sketch after this list).
• So:

  • Do I have a valid example (or is using sleep wrong)?
  • Are my statements above about multi-process and multi-threading (with regard to MRI Ruby 2) correct?
  • Any ideas why I cannot get this to work with Unicorn (or any server)?
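
A quick way to check the quoted GIL behavior in isolation (a standalone sketch, not part of the original question): sleep releases the GIL, so sleeping threads overlap, while CPU-bound threads are admitted one at a time.

  # gil_demo.rb -- IO-style waits overlap under the GIL; CPU-bound work does not
  require 'benchmark'

  # Four sleeping threads finish in ~2s total: sleep releases the GIL.
  puts Benchmark.realtime { 4.times.map { Thread.new { sleep(2) } }.each(&:join) }

  # Four CPU-bound threads take roughly the sum of their single-threaded times:
  # the GIL lets only one of them execute Ruby code at any moment.
  puts Benchmark.realtime {
    4.times.map { Thread.new { 5_000_000.times { Math.sqrt(42) } } }.each(&:join)
  }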

There is a very similar question to mine, but I can't get it working as answered, and it doesn't answer all of my questions about concurrent requests using MRI Ruby.

GitHub project: https://github.com/afrankel/limitedBandwidth (note: the project looks at more than this question of multi-process/threading on the server)

Recommended answer

I invite you to read Jesse Storimer's series Nobody understands the GIL. It might help you better understand some MRI internals.

I have also found Pragmatic Concurrency with Ruby an interesting read; it has some examples of testing concurrency.

In addition, I can recommend the article Removing config.threadsafe!. It might not be relevant for Rails 4 anymore, but it explains the configuration options, one of which you can use to allow concurrency.

Let's discuss the answer to your question.

You can have several threads (using MRI), even with Puma. The GIL ensures that only one thread is active at a time; that is the constraint developers call restrictive (because there is no real parallel execution). Bear in mind that the GIL does not guarantee thread safety. It also does not mean that the other threads are not running: they are waiting for their turn, and they can interleave (the articles above help in understanding this better).

Let me clear up some terms: worker process and thread. A process runs in a separate memory space and can serve several threads. Threads of the same process run in a shared memory space, namely that of their process. (By threads we mean Ruby threads in this context, not CPU threads.)
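
The distinction fits in a few lines of Ruby: a forked child gets its own copy of memory, while a thread shares its parent's. A toy sketch (assumes a Unix-like system, since fork is not available on Windows MRI):

  # processes_vs_threads.rb -- separate vs shared memory, in miniature
  counter = 0

  pid = fork { counter += 1 }       # the child increments its own *copy*
  Process.wait(pid)
  puts counter                      # => 0, the parent never sees the change

  Thread.new { counter += 1 }.join  # a thread shares the process's memory
  puts counter                      # => 1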

In regard to your question's configuration and the GitHub repo you shared, I think an appropriate configuration (I used Puma) is to set up 4 workers and 1 to 40 threads. The idea is that one worker serves one tab, and each tab sends up to 10 requests.

Let's get started:

I work on Ubuntu in a virtual machine, so first I enabled 4 cores in the virtual machine's settings (plus some other settings I thought might help), and verified it on my machine:

      Linux command --> lscpu
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                4
      On-line CPU(s) list:   0-3
      Thread(s) per core:    1
      Core(s) per socket:    4
      Socket(s):             1
      NUMA node(s):          1
      Vendor ID:             GenuineIntel
      CPU family:            6
      Model:                 69
      Stepping:              1
      CPU MHz:               2306.141
      BogoMIPS:              4612.28
      L1d cache:             32K
      L1i cache:             32K
      L2 cache:              6144K
      NUMA node0 CPU(s):     0-3
      

I used the GitHub project you shared and modified it slightly: I created a Puma configuration file named puma.rb (in the config directory) with the following content:

      workers Integer(ENV['WEB_CONCURRENCY'] || 1)
      threads_count = Integer(ENV['MAX_THREADS'] || 1)
      threads 1, threads_count
      
      preload_app!
      
      rackup      DefaultRackup
      port        ENV['PORT']     || 3000
      environment ENV['RACK_ENV'] || 'development'
      
      on_worker_boot do
        # Worker specific setup for Rails 4.1+
        # See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
        #ActiveRecord::Base.establish_connection
      end
      

By default, Puma starts with 1 worker and 1 thread. You can use environment variables to modify those parameters, which I did:

      export MAX_THREADS=40
      export WEB_CONCURRENCY=4
      

To start Puma with this configuration, I typed the following in the Rails app directory:

      bundle exec puma -C config/puma.rb
      


I opened the browser with four tabs to call the app's URL.

The first request started around 15:45:05 and the last around 15:49:44, an elapsed time of 4 minutes and 39 seconds. You can also see the request IDs arriving in non-sorted order in the log file (see below).

Each API call in the GitHub project sleeps for 15 seconds. We have 4 tabs, each making 10 API calls, which gives a maximum elapsed time of 4 × 10 × 15 s = 600 seconds, i.e. 10 minutes (in a strictly serial mode).

The theoretical ideal would be everything in parallel, with an elapsed time not far above 15 seconds, but I did not expect that at all. I was not sure exactly what to expect, yet I was still positively surprised (considering that I ran on a virtual machine and that MRI is restrained by the GIL, among other factors): the measured 4 minutes 39 seconds is less than half of the strictly serial maximum of 10 minutes.

EDIT: I read further about Rack::Lock, which wraps a mutex around each request (the third article above). I found the option config.allow_concurrency = true to be a time saver. One small caveat: the connection pool had to be increased to match (even though these requests don't query the database); the maximum number of threads is a good default for the pool size, 40 in this case.
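
For reference, that option lives in the application config. A sketch of how I set it for Rails 4.x (YourApp is a placeholder module name; the matching pool: 40 goes in config/database.yml):

  # config/application.rb (excerpt) -- sketch; "YourApp" is a placeholder
  module YourApp
    class Application < Rails::Application
      # Drops Rack::Lock, so requests within one process are no longer serialized.
      config.allow_concurrency = true
    end
  end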

I tested the app with JRuby, and the actual elapsed time was 2 minutes, with allow_concurrency=true.

I tested the app with MRI, and the actual elapsed time was 1 minute 47 seconds, with allow_concurrency=true. That really surprised me, because I expected MRI to be slower than JRuby. It was not, which makes me question the widespread discussion about the speed differences between MRI and JRuby.

Watching the responses in the different tabs is "more random" now: it happens that tab 3 or 4 completes before tab 1, the one I requested first.

I think the test seems OK because you don't have race conditions. However, I am not sure about the application-wide consequences of setting config.allow_concurrency = true in a real-world application.
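
That caveat is worth making concrete: once several request threads can touch the same object, a plain read-modify-write can lose updates, so shared state needs a Mutex. An illustration (not from the project; under MRI the GIL makes lost updates rare but not impossible, under JRuby they are routine):

  # shared_state_demo.rb -- shared mutable state needs a lock under concurrency
  hits = 0
  lock = Mutex.new

  100.times.map {
    Thread.new do
      lock.synchronize { hits += 1 }  # drop the synchronize and updates can be lost
    end
  }.each(&:join)

  puts hits  # => 100 with the mutex; possibly less without it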

Feel free to check it out and let me know any feedback you might have. I still have the clone on my machine; let me know if you are interested.

To answer your questions in order:

• I think your example is valid. For concurrency, though, it is better to test with a shared resource (as in the second article).
• Regarding your statements: as mentioned at the beginning of this answer, MRI is multi-threaded but restricted by the GIL to one active thread at a time. That raises the question whether, with MRI, it wouldn't be better to test with more processes and fewer threads. I don't really know; a first guess is that it makes little or no difference, but maybe someone can shed light on this.
• I think your example is fine; it just needed some slight modifications.

Log file of the Rails app:

      **config.allow_concurrency = false (by default)**
      -> Ideally 1 worker per core, each worker serves up to 10 threads.
      
      [3045] Puma starting in cluster mode...
      [3045] * Version 2.11.2 (ruby 2.1.5-p273), codename: Intrepid Squirrel
      [3045] * Min threads: 1, max threads: 40
      [3045] * Environment: development
      [3045] * Process workers: 4
      [3045] * Preloading application
      [3045] * Listening on tcp://0.0.0.0:3000
      [3045] Use Ctrl-C to stop
      [3045] - Worker 0 (pid: 3075) booted, phase: 0
      [3045] - Worker 1 (pid: 3080) booted, phase: 0
      [3045] - Worker 2 (pid: 3087) booted, phase: 0
      [3045] - Worker 3 (pid: 3098) booted, phase: 0
      Started GET "/assets/angular-ui-router/release/angular-ui-router.js?body=1" for 127.0.0.1 at 2015-05-11 15:45:05 +0800
      ...
      ...
      ...
      Processing by ApplicationController#api_call as JSON
        Parameters: {"t"=>"15?id=9"}
      Completed 200 OK in 15002ms (Views: 0.2ms | ActiveRecord: 0.0ms)
      [3075] 127.0.0.1 - - [11/May/2015:15:49:44 +0800] "GET /api_call.json?t=15?id=9 HTTP/1.1" 304 - 60.0230
      

---

      **config.allow_concurrency = true**
      -> Ideally 1 worker per core, each worker serves up to 10 threads.
      
      [22802] Puma starting in cluster mode...
      [22802] * Version 2.11.2 (ruby 2.2.0-p0), codename: Intrepid Squirrel
      [22802] * Min threads: 1, max threads: 40
      [22802] * Environment: development
      [22802] * Process workers: 4
      [22802] * Preloading application
      [22802] * Listening on tcp://0.0.0.0:3000
      [22802] Use Ctrl-C to stop
      [22802] - Worker 0 (pid: 22832) booted, phase: 0
      [22802] - Worker 1 (pid: 22835) booted, phase: 0
      [22802] - Worker 3 (pid: 22852) booted, phase: 0
      [22802] - Worker 2 (pid: 22843) booted, phase: 0
      Started GET "/" for 127.0.0.1 at 2015-05-13 17:58:20 +0800
      Processing by ApplicationController#index as HTML
        Rendered application/index.html.erb within layouts/application (3.6ms)
      Completed 200 OK in 216ms (Views: 200.0ms | ActiveRecord: 0.0ms)
      [22832] 127.0.0.1 - - [13/May/2015:17:58:20 +0800] "GET / HTTP/1.1" 200 - 0.8190
      ...
      ...
      ...
      Completed 200 OK in 15003ms (Views: 0.1ms | ActiveRecord: 0.0ms)
      [22852] 127.0.0.1 - - [13/May/2015:18:00:07 +0800] "GET /api_call.json?t=15?id=10 HTTP/1.1" 304 - 15.0103
      

---

      **config.allow_concurrency = true (by default)**
      -> Ideally each thread serves a request.
      
      Puma starting in single mode...
      * Version 2.11.2 (jruby 2.2.2), codename: Intrepid Squirrel
      * Min threads: 1, max threads: 40
      * Environment: development
      NOTE: ActiveRecord 4.2 is not (yet) fully supported by AR-JDBC, please help us finish 4.2 support - check http://bit.ly/jruby-42 for starters
      * Listening on tcp://0.0.0.0:3000
      Use Ctrl-C to stop
      Started GET "/" for 127.0.0.1 at 2015-05-13 18:23:04 +0800
      Processing by ApplicationController#index as HTML
        Rendered application/index.html.erb within layouts/application (35.0ms)
      ...
      ...
      ...
      Completed 200 OK in 15020ms (Views: 0.7ms | ActiveRecord: 0.0ms)
      127.0.0.1 - - [13/May/2015:18:25:19 +0800] "GET /api_call.json?t=15?id=9 HTTP/1.1" 304 - 15.0640
      

