Optimizing Kohana-based Websites for Speed and Scalability


Question

    A site I built with Kohana was slammed with an enormous amount of traffic yesterday, causing me to take a step back and evaluate some of the design. I'm curious what are some standard techniques for optimizing Kohana-based applications?

    I'm interested in benchmarking as well. Do I need to setup Benchmark::start() and Benchmark::stop() for each controller-method in order to see execution times for all pages, or am I able to apply benchmarking globally and quickly?

    I will be using the Cache-library more in time to come, but I am open to more suggestions as I'm sure there's a lot I can do that I'm simply not aware of at the moment.

    Solution

    What I will say in this answer is not specific to Kohana, and can probably apply to lots of PHP projects.

    Here are some points that come to my mind when talking about performance, scalability, PHP, ...
    I've used many of those ideas while working on several projects -- and they helped; so they could probably help here too.


    First of all, when it comes to performance, there are many aspects/questions to consider:

    • configuration of the server (both Apache, PHP, MySQL, other possible daemons, and system); you might get more help about that on ServerFault, I suppose,
    • PHP code,
    • Database queries,
    • Whether every request really needs to hit your webserver at all,
    • Can you use any kind of caching mechanism? Or do you always need fully up-to-date data on the website?


    Using a reverse proxy

    The first thing that could be really useful is using a reverse proxy, like varnish, in front of your webserver: let it cache as many things as possible, so only requests that really need PHP/MySQL calculations (and, of course, some other requests, when they are not in the cache of the proxy) make it to Apache/PHP/MySQL.

    • First of all, your CSS/Javascript/Images -- well, everything that is static -- probably don't need to be always served by Apache
      • So, you can have the reverse proxy cache all those.
      • Serving those static files is no big deal for Apache, but the less it has to work for those, the more it will be able to do with PHP.
      • Remember: Apache can only serve a finite, limited, number of requests at a time.
    • Then, have the reverse proxy serve as many PHP-pages as possible from cache: there are probably some pages that don't change that often, and could be served from cache. Instead of using some PHP-based cache, why not let another, lighter, server serve those (and fetch them from the PHP server from time to time, so they are always almost up to date)?
      • For instance, if you have some RSS feeds (We generally tend to forget those, when trying to optimize for performances) that are requested very often, having them in cache for a couple of minutes could save hundreds/thousands of request to Apache+PHP+MySQL!
      • Same for the most visited pages of your site, if they don't change for at least a couple of minutes (example: homepage?), then, no need to waste CPU re-generating them each time a user requests them.
    • Maybe there is a difference between pages served for anonymous users (the same page for all anonymous users) and pages served for identified users ("Hello Mr X, you have new messages", for instance)?
      • If so, you can probably configure the reverse proxy to cache the page that is served for anonymous users (based on a cookie, like the session cookie, typically)
      • It'll mean that Apache+PHP has less to deal with: only identified users -- which might be only a small part of your users.
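
    As an illustration, the anonymous-vs-identified split described above could look something like this in Varnish's configuration language (a hypothetical sketch only: it assumes the session cookie is named "session", and VCL syntax varies between Varnish versions -- this uses Varnish 4+ style):

```vcl
# Hypothetical VCL sketch (Varnish 4+ syntax; Varnish 2/3 differ).
sub vcl_recv {
    # Static assets: strip cookies so they are always cacheable.
    if (req.url ~ "\.(css|js|png|jpg|gif|ico)$") {
        unset req.http.Cookie;
        return (hash);
    }
    # Identified users (session cookie present): go straight to Apache/PHP.
    if (req.http.Cookie ~ "session") {
        return (pass);
    }
    # Anonymous users: serve from the cache when possible.
    return (hash);
}
```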

    About using a reverse-proxy as cache, for a PHP application, you can, for instance, take a look at Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
    (Yep, they are using Squid, and I was talking about varnish -- that's just another possibility ^^ Varnish being more recent, but more dedicated to caching)

    If you do that well enough, and manage to stop re-generating too many pages again and again, maybe you won't even have to optimize any of your code ;-)
    At least, maybe not in any kind of rush... And it's always better to perform optimizations when you are not under too much pressure...


    As a sidenote: you are saying in the OP:

    A site I built with Kohana was slammed with an enormous amount of traffic yesterday,

    This is the kind of sudden situation where a reverse-proxy can literally save the day, if your website can deal with not being up to date by the second:

    • install it, configure it, let it always -- every normal day -- run:
      • Configure it to not keep PHP pages in cache; or only for a short duration; this way, you always have up to date data displayed
    • And, the day you take a slashdot or digg effect:
      • Configure the reverse proxy to keep PHP pages in cache; or for a longer period of time; maybe your pages will not be up to date by the second, but it will allow your website to survive the digg-effect!

    About that, How can I detect and survive being "Slashdotted"? might be an interesting read.


    On the PHP side of things:

    First of all: are you using a recent version of PHP? New versions regularly bring improvements in speed ;-)
    For instance, take a look at Benchmark of PHP Branches 3.0 through 5.3-CVS.

    Note that performance is quite a good reason to use PHP 5.3 (I've made some benchmarks (in French), and results are great)...
    Another pretty good reason being, of course, that PHP 5.2 has reached its end of life, and is not maintained anymore!

    Are you using any opcode cache?

    • I'm thinking about APC - Alternative PHP Cache, for instance (pecl, manual), which is the solution I've seen used the most -- and that is used on all servers on which I've worked.
    • It can really lower the CPU-load of a server a lot, in some cases (I've seen CPU-load on some servers go from 80% to 40%, just by installing APC and activating its opcode-cache functionality!)
    • Basically, execution of a PHP script goes in two steps:
      • Compilation of the PHP source-code to opcodes (kind of an equivalent of JAVA's bytecode)
      • Execution of those opcodes
      • APC keeps those in memory, so there is less work to be done each time a PHP script/file is executed: only fetch the opcodes from RAM, and execute them.
    • You might need to take a look at APC's configuration options, by the way
      • there are quite a few of those, and some can have a great impact on both speed / CPU-load / ease of use for you
      • For instance, disabling [apc.stat](https://php.net/manual/en/apc.configuration.php#ini.apc.stat) can be good for system-load; but it means modifications made to PHP files won't be taken into account unless you flush the whole opcode-cache; for more details about that, see for instance To stat() Or Not To stat()?
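
    To make that concrete, a typical APC setup in php.ini might look like this (a sketch only -- the values are assumptions to tune for your own workload, and older APC releases take apc.shm_size as a plain number of megabytes):

```ini
; Hypothetical php.ini fragment for APC
extension=apc.so
apc.enabled=1
apc.shm_size=64M   ; shared memory for opcodes + user-data cache
apc.stat=0         ; skip stat() on every request; flush the cache on each deploy!
apc.ttl=3600       ; how long unused entries may stay in the cache
```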


    Using cache for data

    As much as possible, it is better to avoid doing the same thing over and over again.

    The main thing I'm thinking about is, of course, SQL queries: many of your pages probably do the same queries, and the results of some of those are probably almost always the same... Which means lots of "useless" queries made to the database, which has to spend time serving the same data over and over again.
    Of course, this is true for other stuff, like Web Services calls, fetching information from other websites, heavy calculations, ...

    It might be very interesting for you to identify:

    • Which queries are run lots of times, always returning the same data
    • Which other (heavy) calculations are done lots of times, always returning the same result

    And store these data/results in some kind of cache, so they are easier to get -- faster -- and you don't have to go to your SQL server for "nothing".

    Great caching mechanisms are, for instance:

    • APC: in addition to the opcode-cache I talked about earlier, it allows you to store data in memory,
    • And/or memcached (see also), which is very useful if you literally have lots of data and/or are using multiple servers, as it is distributed.
    • of course, you can think about files; and probably many other ideas.
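
    As a minimal sketch of that query-result caching, using the PECL memcached extension (illustrative only: the cache key, the 300-second lifetime and the $pdo connection are assumptions, not Kohana's Cache API):

```php
<?php
// Cache a frequent query's result in memcached instead of re-running it.
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

$key      = 'articles:latest';   // made-up cache key
$articles = $memcached->get($key);

if ($articles === false) {
    // Cache miss: go to the database once, keep the result for 5 minutes.
    $stmt     = $pdo->query('SELECT id, title FROM articles ORDER BY created DESC LIMIT 10');
    $articles = $stmt->fetchAll(PDO::FETCH_ASSOC);
    $memcached->set($key, $articles, 300);
}
```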

    I'm pretty sure your framework comes with some cache-related stuff; you probably already know that, as you said "I will be using the Cache-library more in time to come" in the OP ;-)


    Profiling

    Now, a nice thing to do would be to use the Xdebug extension to profile your application: it often allows you to find a couple of weak-spots quite easily -- at least, if there is any function that takes lots of time.

    Configured properly, it will generate profiling files that can be analysed with some graphic tools, such as:

    • KCachegrind: my favorite, but works only on Linux/KDE
    • Wincachegrind for Windows; it does a bit less than KCacheGrind, unfortunately -- it doesn't display callgraphs, typically.
    • Webgrind which runs on a PHP webserver, so works anywhere -- but probably has less features.
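
    For reference, enabling the profiler amounts to a few php.ini directives (a sketch using Xdebug 2.x names; Xdebug 3 renamed these to xdebug.mode=profile and xdebug.output_dir):

```ini
; Hypothetical php.ini fragment -- profile only when requested, not always.
zend_extension=xdebug.so
xdebug.profiler_enable=0
xdebug.profiler_enable_trigger=1       ; profile when XDEBUG_PROFILE is passed
xdebug.profiler_output_dir=/tmp/profiles
```

    You can then append ?XDEBUG_PROFILE=1 to a URL and open the generated cachegrind.out.* file in one of the tools above.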

    For instance, here are a couple screenshots of KCacheGrind:


    (source: pascal-martin.fr)

    (source: pascal-martin.fr)

    (BTW, the callgraph presented on the second screenshot is typically something neither WinCacheGrind nor Webgrind can do, if I remember correctly ^^ )


    (Thanks @Mikushi for the comment) Another possibility that I haven't used much is the xhprof extension: it also helps with profiling, and can generate callgraphs -- but is lighter than Xdebug, which means you should be able to install it on a production server.

    You should be able to use it alongside XHGui, which will help with visualising the data.


    On the SQL side of things:

    Now that we've spoken a bit about PHP, note that it is more than possible that your bottleneck isn't the PHP-side of things, but the database one...

    At least two or three things, here:

    • You should determine:
      • What are the most frequent queries your application is doing
      • Whether those are optimized (using the right indexes, mainly?), using the EXPLAIN statement, if you are using MySQL
      • whether you could cache some of these queries (see what I said earlier)
    • Is your MySQL well configured? I don't know much about that, but there are some configuration options that might have some impact.

    Still, the two most important things are:

    • Don't go to the DB if you don't need to: cache as much as you can!
    • When you have to go to the DB, use efficient queries: use indexes; and profile!
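
    For instance (the table and column names here are made up), checking and then fixing a slow query with MySQL might look like:

```sql
-- Ask MySQL how it executes the query:
EXPLAIN SELECT id, title FROM articles WHERE author_id = 42 ORDER BY created DESC;

-- If the plan shows type=ALL (a full table scan), an index usually helps:
ALTER TABLE articles ADD INDEX idx_author_created (author_id, created);
```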


    And what now?

    If you are still reading, what else could be optimized?

    Well, there is still room for improvements... A couple of architecture-oriented ideas might be:

    • Switch to an n-tier architecture:
      • Put MySQL on another server (2-tier: one for PHP; the other for MySQL)
      • Use several PHP servers (and load-balance the users between those)
      • Use other machines for static files, with a lighter webserver, like:
        • lighttpd
        • or nginx -- this one is becoming more and more popular, btw.
      • Use several servers for MySQL, several servers for PHP, and several reverse-proxies in front of those
      • Of course: install memcached daemons on whatever server has any amount of free RAM, and use them to cache as much as you can / makes sense.
    • Use something "more efficient" than Apache?
      • I hear more and more often about nginx, which is supposed to be great when it comes to PHP and high-volume websites; I've never used it myself, but you might find some interesting articles about it on the net;
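
    As an illustration, a dedicated static-file host with nginx could be as simple as this (the hostname and paths are assumptions):

```nginx
# Hypothetical nginx server block for static files only.
server {
    listen 80;
    server_name static.example.com;
    root /var/www/static;

    location / {
        expires 7d;        # let browsers and proxies cache aggressively
        access_log off;    # skip logging to save disk I/O
    }
}
```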

    Well, maybe some of those ideas are a bit overkill in your situation ^^
    But, still... Why not study them a bit, just in case ? ;-)


    And what about Kohana?

    Your initial question was about optimizing an application that uses Kohana... Well, I've posted some ideas that are true for any PHP application... Which means they are true for Kohana too ;-)
    (Even if not specific to it ^^)

    I said: use cache; Kohana seems to support some caching stuff (You talked about it yourself, so nothing new here...)
    If there is anything that can be done quickly, try it ;-)

    I also said you shouldn't do anything that's not necessary; is there anything enabled by default in Kohana that you don't need?
    Browsing the net, it seems there is at least something about XSS filtering; do you need that?



    Conclusion?

    And, to conclude, a simple thought:

    • How much will it cost your company to pay you 5 days? -- considering it is a reasonable amount of time to do some great optimizations
    • How much will it cost your company to buy (pay for?) a second server, and its maintenance?
    • What if you have to scale larger?
      • How much will it cost to spend 10 days? more? optimizing every possible bit of your application?
      • And how much for a couple more servers?

    I'm not saying you shouldn't optimize: you definitely should!
    But go for "quick" optimizations that will get you big rewards first: using some opcode cache might help you get between 10 and 50 percent off your server's CPU-load... And it takes only a couple of minutes to set up ;-) On the other side, spending 3 days for 2 percent...

    Oh, and, btw: before doing anything: put some monitoring stuff in place, so you know what improvements have been made, and how!
    Without monitoring, you will have no idea of the effect of what you did... Not even if it's a real optimization or not!

    For instance, you could use something like RRDtool + Cacti.
    And showing your boss some nice graphics with a 40% CPU-load drop is always great ;-)


    Anyway, and to really conclude: have fun!
    (Yes, optimizing is fun!)
    (Ergh, I didn't think I would write that much... Hope at least some parts of this are useful... And I should remember this answer: might be useful some other times...)
