Meteor Node Process CPU Usage Nears 100%
Problem description
I'm having trouble with my Meteor app when it reaches its peak traffic (peak here is modest: about 1k visits, maybe 2,500 pageviews in a day). CPU usage spikes and never recovers, so I've taken to using Nodetime to monitor usage, and I've been reloading the process (forever restart) to get things back to normal.
I'm fairly new to profiling, so I'm at a loss for where to start looking for the underlying cause. I'm fairly certain it has to do with my app's server code, but the profiling seems to point to the Fibers module as a "hotspot", which I understand aids in making my server code synchronous.
Below is a snippet from the profiling results. I hope someone can guide me in the right direction in troubleshooting this!
Recommended answer
While I don't have a specific answer to your question, I have experience dealing with CPU issues in our production Meteor app, so I can give you a list of things to investigate.
Upgrade to the latest version of Meteor and the appropriate Node version (see the changelog). As of this writing that's Meteor 0.8.2 and Node 0.10.28.
Read this and this article. The latter makes a great point that you should always try to delay activation of subscriptions until you need them. In particular, you may not need to publish anything for users who are not logged in. In my experience, Meteor CPU problems have everything to do with subscriptions.
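As a rough illustration of not publishing anything to logged-out users, here is a minimal sketch. The helper name publishForLoggedIn, the publication name privateItems, and the findItems callback are hypothetical; the helper relies only on the userId property and ready() method that Meteor provides on the publish handler's `this`.

```javascript
// Hypothetical helper: `sub` is the `this` of a Meteor publish handler,
// `findForUser` is your own function that returns a cursor for a user id.
// In an app it might be wired up like this (names are assumptions):
//   Meteor.publish('privateItems', function () {
//     return publishForLoggedIn(this, findItems);
//   });
function publishForLoggedIn(sub, findForUser) {
  // sub.userId is null when the client is not logged in.
  if (!sub.userId) {
    // Mark the subscription ready with no documents, so no query is observed.
    sub.ready();
    return undefined;
  }
  // Only logged-in users cause a cursor to be created and observed.
  return findForUser(sub.userId);
}
```

With this shape, anonymous visitors cost the server one ready() call instead of a live query.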
Be careful with observe and observeChanges. These are expensive and easy to abuse. In particular:
- Make sure you are calling stop() on your handles when they are no longer needed (consider using a package like publish-with-relations so this is done for you).
- Fetch only the collections and fields that you absolutely need. Observe works by continually diffing objects (which requires lots of CPU). The fewer and smaller objects you have, the less there is to compute.
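The two points above can be sketched together. This is only an illustration, not the answerer's code: the helper name publishObserved and its parameters are hypothetical, while find with a fields option, observeChanges, and the publish handler's added/changed/removed/onStop/ready methods are standard Meteor APIs.

```javascript
// Hypothetical helper: `sub` is the `this` of a Meteor publish handler and
// `collection` is a Mongo collection; the other arguments are your own values.
function publishObserved(sub, collection, collectionName, selector, fields) {
  // Restrict the fields so there is less data to diff on every change.
  var cursor = collection.find(selector, { fields: fields });

  var handle = cursor.observeChanges({
    added: function (id, doc) { sub.added(collectionName, id, doc); },
    changed: function (id, doc) { sub.changed(collectionName, id, doc); },
    removed: function (id) { sub.removed(collectionName, id); }
  });

  // Stop the observer when the client unsubscribes or disconnects;
  // without this the observer keeps running and keeps using CPU.
  sub.onStop(function () { handle.stop(); });

  sub.ready();
  return handle;
}
```

Inside a publish function this could be called as publishObserved(this, Posts, 'posts', { authorId: this.userId }, { title: 1 }), where Posts and the field names are again assumptions.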
Consider using smart-collections before it is retired (see http://meteorhacks.com/retiring-smart-collections.html). Use oplog tailing; this can make a night-and-day difference in performance and CPU usage in your app.
Consider making some things not reactive (also mentioned in the articles above). For us that was a big win. We had one extremely expensive join that was used on two frequently accessed pages on the site. When it got to the point where the CPU was pegged at 100% about every 30 minutes, I gave up on reactivity for that element, did the join on the server, and shipped the data to the client via a method call. I also created a server-side expiring cache for these results and stored them by user (special thanks to Matt DeBergalis for this suggestion).
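The answer does not show the cache itself, so the following is only a minimal sketch of what a per-user expiring cache could look like. The name createExpiringCache, the TTL value, and the compute callback are all assumptions for illustration.

```javascript
// Hypothetical per-user cache: results older than ttlMs are recomputed.
// `now` is injectable so the expiry logic can be exercised without waiting.
function createExpiringCache(ttlMs, now) {
  now = now || Date.now;
  var entries = Object.create(null); // userId -> { value, storedAt }

  return {
    get: function (userId, compute) {
      var t = now();
      var hit = entries[userId];
      if (hit && t - hit.storedAt < ttlMs) {
        return hit.value; // still fresh: skip the expensive work
      }
      var value = compute(userId); // e.g. the expensive server-side join
      entries[userId] = { value: value, storedAt: t };
      return value;
    }
  };
}

// Possible use inside a Meteor method (names are assumptions):
//   var joinCache = createExpiringCache(5 * 60 * 1000);
//   Meteor.methods({
//     expensiveJoin: function () {
//       return joinCache.get(this.userId, computeJoinForUser);
//     }
//   });
```

This sketch never evicts entries for users who stop visiting, so a long-running process would likely want a periodic sweep as well.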
Do a preventative nightly restart. I have a cron job that tells forever to restart our app once a day in the middle of the night. That brings the CPU down from ~10% to 1%. This seems like black magic, but the fact that the CPU usage changes after a reset leads me to believe this is a good idea.
Updated thoughts (1/13/14)
- We migrated to oplog tailing as soon as it was available (Meteor 0.7) and that made a big difference. Note that in order to get access to the oplog, you'll probably need to either host your own db or run a dedicated instance on the hosting provider of your choice. I'd also recommend adding the facts package to actually tell if it's working.
- There was a memory leak discovered in publish-with-relations, and as of this writing the atmosphere version (v0.1.5) hasn't been bumped to reflect these changes. If you are using it in production, I strongly recommend checking out the HEAD version and running it locally.
- We stopped doing nightly restarts a couple of weeks ago. So far everything has been fine (fingers crossed).
A few months ago we switched over to using an Elastic Deployment on mongohq. It's affordable, the performance has been great, and they even have a blog post which tells you how to enable oplog tailing.
I'd strongly recommend checking out kadira to help diagnose performance issues in your app. Also check out the academy articles, which have a number of good tips in them.