Concurrent requests with MRI Ruby


Problem description


I put together a simple example trying to demonstrate concurrent requests in Rails. Note that I am using MRI Ruby 2 and Rails 4.2.

  def api_call
    sleep(10)
    render :json => "done"
  end


I then open 4 different tabs in Chrome on my Mac (i7 / 4 cores) and see if they run in series or in parallel (really concurrent, which is close but not the same thing), i.e. http://localhost:3000/api_call


I cannot get this to work using Puma, Thin, or Unicorn. The requests each complete in series: the first tab after 10 seconds, the second after 20 (since it had to wait for the first to complete), the third after that...


From what I have read, I believe the following to be true (please correct me) and were my results:

  • Unicorn is multi-process, and my example should have worked (after defining the number of workers in a unicorn.rb config file), but it did not. I can see 4 workers starting, but everything runs in series. I am using the unicorn-rails gem, starting Rails with unicorn -c config/unicorn.rb, and in my unicorn.rb I have:

-- unicorn.rb

worker_processes 4
preload_app true
timeout 30
listen 3000
after_fork do |server, worker|
  ActiveRecord::Base.establish_connection
end

    • Thin and Puma are multithreaded (although Puma at least has a 'clustered' mode where you can start workers with a -w parameter) and should not work anyway (in multithreaded mode) with MRI Ruby 2.0, because "there is a Global Interpreter Lock (GIL) that ensures only one thread can be run at a time".
    • So

      • Do I have a valid example (or is using sleep wrong)?
      • Are my statements above about multi-process and multi-threaded (with respect to MRI Ruby 2) correct?
      • Any ideas why I cannot get it to work with Unicorn (or any server, for that matter)?


      There is a very similar question to mine but I can't get it working as answered and it doesn't answer all of my questions about concurrent requests using MRI Ruby.


      Github project: https://github.com/afrankel/limitedBandwidth (note: project is looking at more than this question of multi-process/threading on the server)

      Answer


      I invite you to read Jesse Storimer's series Nobody Understands the GIL. It might help you better understand some MRI internals.


      I have also found Pragmatic Concurrency with Ruby, which is an interesting read. It has some examples of testing concurrently.


      In addition, I can recommend the article Removing config.threadsafe!. It might not be relevant for Rails 4, but it explains the configuration options, one of which you can use to allow concurrency.


      Let's discuss the answer to your question.


      You can have several threads (using MRI), even with Puma. The GIL ensures that only one thread is active at a time; that is the constraint developers dub restrictive (because there is no real parallel execution). Bear in mind that the GIL does not guarantee thread safety. This does not mean that the other threads are not running; they are waiting for their turn, and they can interleave (the articles above can help you understand this better).
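      A quick way to see that sleeping threads interleave under MRI is a minimal sketch, independent of Rails: the GIL is released while a thread sleeps (or waits on IO), so two sleeping threads overlap in wall-clock time.

```ruby
require 'benchmark'

# Two threads each "handle a request" by sleeping 1 second.
# MRI releases the GIL during sleep/IO waits, so the total
# wall time is ~1 second, not ~2.
elapsed = Benchmark.realtime do
  threads = 2.times.map { Thread.new { sleep 1 } }
  threads.each(&:join)
end
puts format('elapsed: %.2fs', elapsed)
```

      This is exactly why a sleep-based endpoint can still serve overlapping requests on a threaded server despite the GIL.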


      Let me clear up some terms: worker process, thread. A process runs in a separate memory space and can serve several threads. Threads of the same process run in a shared memory space, which is that of their process. With threads we mean Ruby threads in this context, not CPU threads.
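      The difference in memory spaces can be shown in plain Ruby (a minimal sketch; fork requires a Unix-like OS):

```ruby
counter = 0

# A thread shares its process's memory: the mutation is visible here.
Thread.new { counter += 1 }.join
puts counter  # 1

# A forked child gets its own (copy-on-write) memory space:
# its mutation is NOT visible back in the parent.
pid = fork { counter += 1 }
Process.wait(pid)
puts counter  # still 1
```

      This is the same boundary that separates Unicorn/Puma workers (processes) from the threads inside each worker.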


      In regards to your question's configuration and the GitHub repo you shared, I think an appropriate configuration (I used Puma) is to set up 4 workers and 1 to 40 threads. The idea is that one worker serves one tab. Each tab sends up to 10 requests.

      So here we go:


      I work on Ubuntu in a virtual machine. So first I enabled the 4 cores in my virtual machine's settings (and some other settings I thought might help). I could verify this on my machine, so I went with that.

      Linux command --> lscpu
      Architecture:          x86_64
      CPU op-mode(s):        32-bit, 64-bit
      Byte Order:            Little Endian
      CPU(s):                4
      On-line CPU(s) list:   0-3
      Thread(s) per core:    1
      Core(s) per socket:    4
      Socket(s):             1
      NUMA node(s):          1
      Vendor ID:             GenuineIntel
      CPU family:            6
      Model:                 69
      Stepping:              1
      CPU MHz:               2306.141
      BogoMIPS:              4612.28
      L1d cache:             32K
      L1i cache:             32K
      L3 cache:              6144K
      NUMA node0 CPU(s):     0-3
      


      I used your shared GitHub project and modified it slightly. I created a Puma configuration file named puma.rb (put it in the config directory) with the following content:

      workers Integer(ENV['WEB_CONCURRENCY'] || 1)
      threads_count = Integer(ENV['MAX_THREADS'] || 1)
      threads 1, threads_count
      
      preload_app!
      
      rackup      DefaultRackup
      port        ENV['PORT']     || 3000
      environment ENV['RACK_ENV'] || 'development'
      
      on_worker_boot do
        # Worker specific setup for Rails 4.1+
        # See: https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server#on-worker-boot
        #ActiveRecord::Base.establish_connection
      end
      


      By default Puma is started with 1 worker and 1 thread. You can use environment variables to modify those parameters. I did so:

      export MAX_THREADS=40
      export WEB_CONCURRENCY=4
      


      To start Puma with this configuration I typed

      bundle exec puma -C config/puma.rb
      


      in the Rails app directory.


      I opened the browser with four tabs to call the app's URL.


      The first request started around 15:45:05 and the last request completed around 15:49:44. That is an elapsed time of 4 minutes and 39 seconds. You can also see the request ids in non-sorted order in the log file. (See below.)


      Each API call in the GitHub project sleeps for 15 seconds. We have 4 tabs, each with 10 API calls. That makes a maximum elapsed time of 600 seconds, i.e. 10 minutes (in a strictly serial mode).
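      The bounds above can be stated as simple arithmetic (a sketch using the numbers from this test):

```ruby
tabs          = 4
calls_per_tab = 10
sleep_secs    = 15

# Strictly serial: every call waits for the previous one to finish.
serial_max = tabs * calls_per_tab * sleep_secs
puts "serial worst case: #{serial_max}s"    # 600s, i.e. 10 minutes

# Perfectly parallel: all 40 calls sleep at the same time.
parallel_min = sleep_secs
puts "parallel best case: #{parallel_min}s"
```

      The measured 4 minutes 39 seconds (279s) sits between these two bounds, well under half of the serial worst case.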


      The ideal result in theory would be everything in parallel and an elapsed time not far from 15 seconds, but I did not expect that at all. I was not sure what exactly to expect as a result, but I was still positively surprised (considering that I ran on a virtual machine and MRI is restrained by the GIL and some other factors): the elapsed time of this test was less than half the maximum (strictly serial) elapsed time.


      EDIT: I read further about Rack::Lock, which wraps a mutex around each request (third article above). I found the option config.allow_concurrency = true to be a time saver. A small caveat was having to increase the connection pool accordingly (even though the requests do not query the database); the maximum number of threads is a good default, 40 in this case.
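      For reference, the option goes in the environment configuration; a sketch assuming it is set for development (config/environments/development.rb):

```ruby
# config/environments/development.rb
Rails.application.configure do
  # Disable the Rack::Lock middleware so requests are not
  # serialized behind a per-request mutex.
  config.allow_concurrency = true
end
```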


      I tested the app with JRuby, and the actual elapsed time was 2 minutes, with allow_concurrency = true.


      I tested the app with MRI, and the actual elapsed time was 1 min 47 s, with allow_concurrency = true. This really surprised me, because I expected MRI to be slower than JRuby. It was not. This makes me question the widespread discussion about the speed differences between MRI and JRuby.


      Watching the responses on the different tabs, they are "more random" now. It happens that tab 3 or 4 completes before tab 1, which I requested first.


      I think that because you don't have race conditions, the test seems to be OK. However, I am not sure about the application-wide consequences if you set config.allow_concurrency = true in a real-world application.


      Feel free to check it out and let me know any feedback you readers might have. I still have the clone on my machine. Let me know if you are interested.


      To answer your questions in order:

      • I think your example is valid by its result. For concurrency, however, it is better to test with shared resources (as, for example, in the second article).
      • In regards to your statements, as mentioned at the beginning of this answer, MRI is multi-threaded but restricted by the GIL to one active thread at a time. This raises the question: with MRI, isn't it better to test with more processes and fewer threads? I don't really know; a first guess would be no, or not much of a difference. Maybe someone can shed light on this.
      • Your example is just fine, I think. It just needed some slight modifications.
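      As a sketch of what testing with a shared resource could look like (a hypothetical example, not from the linked repo): several threads increment one counter, and a Mutex keeps the read-modify-write atomic so no update is lost.

```ruby
counter = 0
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    1_000.times do
      # Without the lock, `counter += 1` is a read-modify-write
      # that threads could interleave, losing updates.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)

puts counter  # 10000
```

      A race-free endpoint like the sleep example needs none of this, which is why the test above was safe; shared mutable state is where allow_concurrency starts to matter.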


      Log file Rails app:

      **config.allow_concurrency = false (by default)**
      -> Ideally 1 worker per core, each worker serves up to 10 threads.
      
      [3045] Puma starting in cluster mode...
      [3045] * Version 2.11.2 (ruby 2.1.5-p273), codename: Intrepid Squirrel
      [3045] * Min threads: 1, max threads: 40
      [3045] * Environment: development
      [3045] * Process workers: 4
      [3045] * Preloading application
      [3045] * Listening on tcp://0.0.0.0:3000
      [3045] Use Ctrl-C to stop
      [3045] - Worker 0 (pid: 3075) booted, phase: 0
      [3045] - Worker 1 (pid: 3080) booted, phase: 0
      [3045] - Worker 2 (pid: 3087) booted, phase: 0
      [3045] - Worker 3 (pid: 3098) booted, phase: 0
      Started GET "/assets/angular-ui-router/release/angular-ui-router.js?body=1" for 127.0.0.1 at 2015-05-11 15:45:05 +0800
      ...
      ...
      ...
      Processing by ApplicationController#api_call as JSON
        Parameters: {"t"=>"15?id=9"}
      Completed 200 OK in 15002ms (Views: 0.2ms | ActiveRecord: 0.0ms)
      [3075] 127.0.0.1 - - [11/May/2015:15:49:44 +0800] "GET /api_call.json?t=15?id=9 HTTP/1.1" 304 - 60.0230
      


      **config.allow_concurrency = true**
      -> Ideally 1 worker per core, each worker serves up to 10 threads.
      
      [22802] Puma starting in cluster mode...
      [22802] * Version 2.11.2 (ruby 2.2.0-p0), codename: Intrepid Squirrel
      [22802] * Min threads: 1, max threads: 40
      [22802] * Environment: development
      [22802] * Process workers: 4
      [22802] * Preloading application
      [22802] * Listening on tcp://0.0.0.0:3000
      [22802] Use Ctrl-C to stop
      [22802] - Worker 0 (pid: 22832) booted, phase: 0
      [22802] - Worker 1 (pid: 22835) booted, phase: 0
      [22802] - Worker 3 (pid: 22852) booted, phase: 0
      [22802] - Worker 2 (pid: 22843) booted, phase: 0
      Started GET "/" for 127.0.0.1 at 2015-05-13 17:58:20 +0800
      Processing by ApplicationController#index as HTML
        Rendered application/index.html.erb within layouts/application (3.6ms)
      Completed 200 OK in 216ms (Views: 200.0ms | ActiveRecord: 0.0ms)
      [22832] 127.0.0.1 - - [13/May/2015:17:58:20 +0800] "GET / HTTP/1.1" 200 - 0.8190
      ...
      ...
      ...
      Completed 200 OK in 15003ms (Views: 0.1ms | ActiveRecord: 0.0ms)
      [22852] 127.0.0.1 - - [13/May/2015:18:00:07 +0800] "GET /api_call.json?t=15?id=10 HTTP/1.1" 304 - 15.0103
      


      **config.allow_concurrency = true (by default)**
      -> Ideally each thread serves a request.
      
      Puma starting in single mode...
      * Version 2.11.2 (jruby 2.2.2), codename: Intrepid Squirrel
      * Min threads: 1, max threads: 40
      * Environment: development
      NOTE: ActiveRecord 4.2 is not (yet) fully supported by AR-JDBC, please help us finish 4.2 support - check http://bit.ly/jruby-42 for starters
      * Listening on tcp://0.0.0.0:3000
      Use Ctrl-C to stop
      Started GET "/" for 127.0.0.1 at 2015-05-13 18:23:04 +0800
      Processing by ApplicationController#index as HTML
        Rendered application/index.html.erb within layouts/application (35.0ms)
      ...
      ...
      ...
      Completed 200 OK in 15020ms (Views: 0.7ms | ActiveRecord: 0.0ms)
      127.0.0.1 - - [13/May/2015:18:25:19 +0800] "GET /api_call.json?t=15?id=9 HTTP/1.1" 304 - 15.0640
      

