How to tune a Ruby on Rails application running on Heroku which uses production-level Heroku Postgres?


Problem description



    The company I work for decided to move their entire stack to Heroku. The main motivation was its ease of use: no sysadmin, no crying. But I still have some questions about it...

    I'm running load and stress tests on both the application platform and the Postgres service, using blitz as a Heroku add-on. I hit the site with between 1 and 250 users, got some very interesting results, and I need help evaluating them.

    The Test Stack:

    Application specifications

    There is nothing particularly special about it.

    • Rails 4.0.4
    • Unicorn
    • database.yml set up to connect to Heroku Postgres.
    • No caching.

    Database

    It's a Standard Tengu (Heroku's naming conventions will kill me one day :), properly connected to the application.

    Heroku configs

    I applied everything in unicorn.rb as described in the "Deploying Rails Applications With Unicorn" article; a sketch of the resulting config follows the settings below. I have 2 regular web dynos.

    WEB_CONCURRENCY  : 2
    DB_POOL          : 5
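
    For reference, this is roughly the config/unicorn.rb that article prescribes; a sketch from memory of the Heroku docs, not my exact file. The point to notice is that WEB_CONCURRENCY drives the worker count, and each forked worker re-establishes its own ActiveRecord connection (with the pool size typically taken from DB_POOL in database.yml):

      # config/unicorn.rb -- sketch following Heroku's Unicorn deployment article
      worker_processes Integer(ENV["WEB_CONCURRENCY"] || 3)
      timeout 15
      preload_app true

      before_fork do |server, worker|
        Signal.trap 'TERM' do
          puts 'Unicorn master intercepting TERM and sending myself QUIT instead'
          Process.kill 'QUIT', Process.pid
        end

        # Disconnect in the master so each worker opens a fresh connection
        defined?(ActiveRecord::Base) and
          ActiveRecord::Base.connection.disconnect!
      end

      after_fork do |server, worker|
        Signal.trap 'TERM' do
          puts 'Unicorn worker intercepting TERM and doing nothing. Wait for master to send QUIT'
        end

        # Every forked worker establishes its own Postgres connection here
        defined?(ActiveRecord::Base) and
          ActiveRecord::Base.establish_connection
      end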
    

    Data

    • episodes table: ~100,000 rows
    • episode_urls table: ~300,000 rows
    • episode_images table: ~75,000 rows

    Code

    episodes_controller.rb

      def index
        # Eager-load the associations the view uses, to avoid N+1 queries
        @episodes = Episode.joins(:program)
                           .where(programs: { channel_id: 1 })
                           .limit(100)
                           .includes(:episode_image, :episode_urls)
      end
    

    episodes/index.html.erb

    <% @episodes.each do |t| %>
      <% if !t.episode_image.blank? %>
        <li><%= image_tag(t.episode_image.image(:thumb)) %></li>
      <% end %>
      <li><%= t.episode_urls.first.mas_path if !t.episode_urls.first.blank? %></li>
      <li><%= t.title %></li>
    <% end %>
    

    Scenario #1:

    Web dynos   : 2
    Duration    : 30 seconds
    Timeout     : 8000 ms
    Start users : 10
    End users   : 10
    

    Result:

    HITS 100.00% (484)
    ERRORS 0.00% (0)
    TIMEOUTS 0.00% (0)
    

    This rush generated 218 successful hits in 30.00 seconds and we transferred 6.04 MB of data in and out of your app. The average hit rate of 7.27/second translates to about 627,840 hits/day.

    Scenario #2:

    Web dynos   : 2
    Duration    : 30 seconds
    Timeout     : 8000 ms
    Start users : 20
    End users   : 20
    

    Result:

    HITS 100.00% (484)
    ERRORS 0.00% (0)
    TIMEOUTS 0.00% (0)
    

    This rush generated 365 successful hits in 30.00 seconds and we transferred 10.12 MB of data in and out of your app. The average hit rate of 12.17/second translates to about 1,051,200 hits/day. The average response time was 622 ms.

    Scenario #3:

    Web dynos   : 2
    Duration    : 30 seconds
    Timeout     : 8000 ms
    Start users : 50
    End users   : 50
    

    Result:

    HITS 100.00% (484)
    ERRORS 0.00% (0)
    TIMEOUTS 0.00% (0)
    

    This rush generated 371 successful hits in 30.00 seconds and we transferred 10.29 MB of data in and out of your app. The average hit rate of 12.37/second translates to about 1,068,480 hits/day. The average response time was 2,631 ms.

    Scenario #4:

    Web dynos   : 4
    Duration    : 30 seconds
    Timeout     : 8000 ms
    Start users : 50
    End users   : 50
    

    Result:

    HITS 100.00% (484)
    ERRORS 0.00% (0)
    TIMEOUTS 0.00% (0)
    

    This rush generated 484 successful hits in 30.00 seconds and we transferred 13.43 MB of data in and out of your app. The average hit rate of 16.13/second translates to about 1,393,920 hits/day. The average response time was 1,856 ms.

    Scenario #5:

    Web dynos   : 4
    Duration    : 30 seconds
    Timeout     : 8000 ms
    Start users : 150
    End users   : 150
    

    Result:

    HITS 71.22% (386)
    ERRORS 0.00% (0)
    TIMEOUTS 28.78% (156)
    

    This rush generated 386 successful hits in 30.00 seconds and we transferred 10.76 MB of data in and out of your app. The average hit rate of 12.87/second translates to about 1,111,680 hits/day. The average response time was 5,446 ms.

    Scenario #6:

    Web dynos   : 10
    Duration    : 30 seconds
    Timeout     : 8000 ms
    Start users : 150
    End users   : 150
    

    Result:

    HITS 73.79% (428)
    ERRORS 0.17% (1)
    TIMEOUTS 26.03% (151)
    

    This rush generated 428 successful hits in 30.00 seconds and we transferred 11.92 MB of data in and out of your app. The average hit rate of 14.27/second translates to about 1,232,640 hits/day. The average response time was 4,793 ms. You've got bigger problems, though: 26.21% of the users during this rush experienced timeouts or errors!

    General Summary:

    • The hit rate never goes beyond about 15/second, even though 150 users are sending requests to the application.
    • Increasing the number of web dynos does not help handle the requests.

    Questions:

    1. When I use caching with memcached (the Memcachier add-on from Heroku), even 2 web dynos can handle >180 hits per second. I'm just trying to understand what the dynos and the Postgres service can do without a cache; that way I can work out how to tune them. How do I do that?

    2. Standard Tengu is said to support 200 concurrent connections. So why does it never reach that number?

    3. If having a production-level db and increasing web dynos won't help scale my app, what's the point of using Heroku?

    4. Probably the most important question: What am I doing wrong? :)

    Thank you for even reading this crazy question!

    Solution

    I finally figured out the issue.

    Firstly, remember my code in the view:

    <% @episodes.each do |t| %>
      <% if !t.episode_image.blank? %>
        <li><%= image_tag(t.episode_image.image(:thumb)) %></li>
      <% end %>
      <li><%= t.episode_urls.first.mas_path if !t.episode_urls.first.blank? %></li>
      <li><%= t.title %></li>
    <% end %>
    

    Here I'm fetching each episode's episode_image inside my iteration. Even though I was using includes in my controller, there was a big mistake in my table schema: I did not have an index on episode_id in my episode_images table! This was causing an extremely high query time. I found it using New Relic's database reports. All other query times were 0.5 ms or 2-3 ms, but episode.episode_image was taking almost 6,500 ms!
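
    The fix itself was a one-line migration. A minimal sketch, assuming standard Rails conventions (the migration name is hypothetical; the add_index call is the part that matters):

      class AddEpisodeIdIndexToEpisodeImages < ActiveRecord::Migration
        def change
          # Index the foreign key so the per-episode image lookup can use an
          # index scan instead of scanning ~75,000 rows on every hit
          add_index :episode_images, :episode_id
        end
      end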

    I don't know much about the relationship between query time and application execution, but once I added the index to my episode_images table, I could clearly see the difference. If your database schema is designed properly, you probably won't face any problems scaling via Heroku. But no number of dynos can help you with a badly designed database.
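
    If you want to see the difference yourself, ActiveRecord can print the query plan for you; a sketch, assuming the model is named EpisodeImage (relations have supported explain since Rails 3.2):

      # Before the index this reports a sequential scan over episode_images;
      # after the index, an index scan on episode_id.
      EpisodeImage.where(episode_id: 1).explain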

    For people who might run into the same problem, I would like to share some of my findings about the relationship between Heroku web dynos, Unicorn workers, and PostgreSQL active connections:

    Basically, Heroku provides you with a dyno, which is a kind of small virtual machine with 1 core and 512 MB of RAM. Inside that little virtual machine, your Unicorn server runs. Unicorn has a master process and worker processes. Each of your Unicorn workers holds its own permanent connection to your PostgreSQL server (don't forget to check this out). It basically means that when you have one Heroku dyno up with 3 Unicorn workers running on it, you have at least 4 active connections. If you have 2 web dynos, you have at least 8 active connections.
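
    So the back-of-the-envelope math, under my own assumption above that the master process accounts for one extra connection per dyno, is simply:

      # Sketch of the connection arithmetic described in the paragraph above
      def active_connections(web_dynos, unicorn_workers)
        web_dynos * (unicorn_workers + 1)
      end

      active_connections(1, 3) # => 4
      active_connections(2, 3) # => 8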

    Let's say you have a Standard Tengu Postgres with a 200-concurrent-connection limit. If you have problematic queries on top of a bad db design, neither the db nor more dynos can save you without a cache... If you have long-running queries, I think you have no choice other than caching.
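
    For completeness, a minimal sketch of what that caching could look like in my view, assuming a memcached-backed cache store such as the Memcachier add-on mentioned in the question (the per-record cache call is illustrative, not my actual code):

      <%# Fragment caching: markup for an unchanged episode is served from the %>
      <%# cache, skipping the ERB work (the controller query still runs). %>
      <% @episodes.each do |t| %>
        <% cache t do %>
          <% if !t.episode_image.blank? %>
            <li><%= image_tag(t.episode_image.image(:thumb)) %></li>
          <% end %>
          <li><%= t.episode_urls.first.mas_path if !t.episode_urls.first.blank? %></li>
          <li><%= t.title %></li>
        <% end %>
      <% end %>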

    All of the above are my own findings; if there is anything wrong with them, please warn me in your comments.
