Meteor Node Process CPU Usage Nears 100%


Question

I'm having trouble with my Meteor app when it reaches its peak traffic (the peak is modest: about 1k visits, maybe 2,500 pageviews in a day). CPU usage spikes and never recovers, so I've started using Nodetime to monitor usage, and I've been reloading the process (forever restart) to get things back to normal.

I'm fairly new to profiling, so I'm at a loss for where to start looking for the underlying cause. I'm fairly certain it has to do with my app's server code, but the profiling seems to point to the Fibers module as a "hotspot", which I understand aids in making my server code synchronous.

Below is a snippet from the profiling results. I hope someone can guide me in the right direction in troubleshooting this!

Recommended answer

While I don't have a specific answer to your question, I have experience dealing with CPU issues in our production Meteor app, so I can give you a list of things to investigate.

  1. Upgrade to the latest version of Meteor and the appropriate Node version (see the changelog). As of this writing that's Meteor 0.8.2 and Node 0.10.28.

Read this and this article. The latter makes a great point: you should always try to delay activating subscriptions until you need them. In particular, you may not need to publish anything for users who are not logged in. In my experience, Meteor CPU problems have everything to do with subscriptions.
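The point about logged-out users can be sketched as a small wrapper around a publish function. This is a minimal sketch, not code from the articles: `requireLogin` is a hypothetical helper name, and the publication name and `Posts` collection in the usage comment are assumptions. It relies only on `this.userId` and `this.ready()`, which Meteor provides inside publish functions.

```javascript
// Hypothetical helper: wraps a publish function so that logged-out clients
// get an empty, ready subscription and no cursor (or observer) is created.
function requireLogin(publishFn) {
  return function (...args) {
    if (!this.userId) {
      // Mark the subscription ready without publishing any documents.
      this.ready();
      return undefined;
    }
    return publishFn.apply(this, args);
  };
}

// Usage inside a Meteor server file (names are illustrative):
// Meteor.publish('myPosts', requireLogin(function () {
//   return Posts.find({ owner: this.userId }, { fields: { title: 1 } });
// }));
```

On the client, the matching idea is to call `Meteor.subscribe` only once the user reaches a page that actually needs the data, rather than at startup.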

Be careful with observe and observeChanges. These are expensive and easy to abuse. In particular:

  • Make sure you call stop() on your handles when they are no longer needed (consider using a package like publish-with-relations so this is done for you).
  • Fetch only the collections and fields that you absolutely need. Observe works by continually diffing objects, which requires a lot of CPU. The fewer and smaller the objects, the less there is to compute.
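Both points can be illustrated with a publish function that observes a cursor by hand. The following is a sketch under assumptions, not the answerer's code: `publishCount`, `countsCollection`, and `countId` are hypothetical names, and `sub` stands for the publish context (`this` inside `Meteor.publish`). It uses `cursor.observeChanges`, `handle.stop()`, and the publish-context calls `added`, `changed`, `ready`, and `onStop`.

```javascript
// Publishes a live count of the documents matched by `cursor`.
// `sub` is the publish context; the other names are illustrative.
function publishCount(sub, cursor, countsCollection, countId) {
  let count = 0;
  let initializing = true;

  // observeChanges delivers the initial "added" callbacks before it returns,
  // so sub.changed is skipped until the initial count has been published.
  const handle = cursor.observeChanges({
    added() {
      count += 1;
      if (!initializing) sub.changed(countsCollection, countId, { count });
    },
    removed() {
      count -= 1;
      sub.changed(countsCollection, countId, { count });
    },
  });

  initializing = false;
  sub.added(countsCollection, countId, { count });
  sub.ready();

  // Without this, the observer keeps running after the client unsubscribes.
  sub.onStop(() => handle.stop());
  return handle;
}

// Usage inside a Meteor server file (names are illustrative). The `fields`
// option keeps the observed documents as small as possible:
// Meteor.publish('messageCount', function (roomId) {
//   publishCount(this, Messages.find({ roomId }, { fields: { _id: 1 } }),
//     'counts', roomId);
// });
```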

smart-collections has been retired (see http://meteorhacks.com/retiring-smart-collections.html). Use oplog tailing instead; it can make a night-and-day difference in your app's performance and CPU usage.

Consider making some things not reactive (also mentioned in the articles above). For us that was a big win. We had one extremely expensive join that was used on two frequently accessed pages on the site. When it got to the point where the CPU was pegged at 100% about every 30 minutes, I gave up on reactivity for that element, did the join on the server, and shipped the data to the client via a method call. I also created a server-side expiring cache for these results, stored by user (special thanks to Matt DeBergalis for this suggestion).
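A per-user expiring cache of this kind might look roughly like the sketch below. The answer does not show its implementation, so everything here is an assumption: `createExpiringCache`, `ttlMs`, `getOrCompute`, and `invalidate` are hypothetical names, and `computeJoinFor` in the usage comment stands in for whatever expensive join the app performs.

```javascript
// Hypothetical per-user expiring cache. `now` is injectable to ease testing.
function createExpiringCache(ttlMs, now = Date.now) {
  const entries = new Map(); // userId -> { value, expiresAt }

  return {
    // Returns the cached value for userId, recomputing it when the entry
    // is absent or has expired.
    getOrCompute(userId, compute) {
      const hit = entries.get(userId);
      if (hit && hit.expiresAt > now()) return hit.value;
      const value = compute(userId);
      entries.set(userId, { value, expiresAt: now() + ttlMs });
      return value;
    },
    // Drops one user's entry, e.g. after a write that changes the join.
    invalidate(userId) {
      entries.delete(userId);
    },
  };
}

// Usage inside a Meteor server file (names are illustrative):
// const joinCache = createExpiringCache(5 * 60 * 1000);
// Meteor.methods({
//   expensiveJoin() {
//     return joinCache.getOrCompute(this.userId, (userId) => computeJoinFor(userId));
//   },
// });
```

The trade-off is that clients see data up to `ttlMs` old instead of live updates. Note also that this sketch only replaces expired entries when the same user asks again, so a long-running process with many one-time users would want a periodic sweep.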

Do a preventative nightly restart. I have a cron job that tells forever to restart our app once a day in the middle of the night. That brings the CPU down from ~10% to 1%. This seems like black magic, but the fact that CPU usage changes after a restart leads me to believe it is a good idea.

Updated thoughts (1/13/14)

  • We migrated to oplog tailing as soon as it was available (Meteor 0.7), and that made a big difference. Note that in order to get access to the oplog, you'll probably need to either host your own database or run a dedicated instance on the hosting provider of your choice. I'd also recommend adding the facts package so you can tell whether it's actually working.

  • A memory leak was discovered in publish-with-relations, and as of this writing the atmosphere version (v0.1.5) hasn't been bumped to include the fix. If you are using it in production, I strongly recommend checking out the HEAD version and running it locally.

  • We stopped doing nightly restarts a couple of weeks ago. So far everything has been fine (fingers crossed).

  • A few months ago we switched over to using an Elastic Deployment on mongohq. It's affordable, the performance has been great, and they even have a blog post that tells you how to enable oplog tailing.

  • I'd strongly recommend checking out kadira to help diagnose performance issues in your app. Also check out the academy articles, which have a number of good tips.
