Optimizing Kohana-based Websites for Speed and Scalability


Question


    A site I built with Kohana was slammed with an enormous amount of traffic yesterday, causing me to take a step back and evaluate some of the design. I'm curious what are some standard techniques for optimizing Kohana-based applications?

    I'm interested in benchmarking as well. Do I need to setup Benchmark::start() and Benchmark::stop() for each controller-method in order to see execution times for all pages, or am I able to apply benchmarking globally and quickly?

    I will be using the Cache-library more in time to come, but I am open to more suggestions as I'm sure there's a lot I can do that I'm simply not aware of at the moment.

    Solution

    What I will say in this answer is not specific to Kohana, and can probably apply to lots of PHP projects.

    Here are some points that come to my mind when talking about performance, scalability, PHP, ...
    I've used many of those ideas while working on several projects -- and they helped; so they could probably help here too.


    First of all, when it comes to performance, there are many aspects/questions to consider:

    • configuration of the server (both Apache, PHP, MySQL, other possible daemons, and system); you might get more help about that on ServerFault, I suppose,
    • PHP code,
    • Database queries,
    • How your webserver is used (and what it serves itself)?
    • Can you use any kind of caching mechanism? Or do you always need fully up-to-date data on the website?


    Using a reverse proxy

    The first thing that could be really useful is using a reverse proxy, like varnish, in front of your webserver: let it cache as many things as possible, so only requests that really need PHP/MySQL calculations (and, of course, some other requests, when they are not in the cache of the proxy) make it to Apache/PHP/MySQL.

    • First of all, your CSS/Javascript/Images -- well, everything that is static -- probably don't need to be always served by Apache
      • So, you can have the reverse proxy cache all those.
      • Serving those static files is no big deal for Apache, but the less it has to work for those, the more it will be able to do with PHP.
      • Remember: Apache can only serve a finite, limited number of requests at a time.
    • Then, have the reverse proxy serve as many PHP-pages as possible from cache: there are probably some pages that don't change that often, and could be served from cache. Instead of using some PHP-based cache, why not let another, lighter, server serve those (and fetch them from the PHP server from time to time, so they are always almost up to date)?
      • For instance, if you have some RSS feeds (We generally tend to forget those, when trying to optimize for performance) that are requested very often, having them in cache for a couple of minutes could save hundreds/thousands of requests to Apache+PHP+MySQL!
      • Same for the most visited pages of your site, if they don't change for at least a couple of minutes (example: homepage?), then, no need to waste CPU re-generating them each time a user requests them.
    • Maybe there is a difference between pages served for anonymous users (the same page for all anonymous users) and pages served for identified users ("Hello Mr X, you have new messages", for instance)?
      • If so, you can probably configure the reverse proxy to cache the page that is served for anonymous users (based on a cookie, like the session cookie, typically)
      • It'll mean that Apache+PHP has less to deal with: only identified users -- which might be only a small part of your users.

    About using a reverse-proxy as cache, for a PHP application, you can, for instance, take a look at Benchmark Results Show 400%-700% Increase In Server Capabilities with APC and Squid Cache.
    (Yep, they are using Squid, and I was talking about varnish -- that's just another possibility ^^ Varnish being more recent, but more dedicated to caching)
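To make the idea concrete, here is a minimal Varnish VCL sketch of the "cache for anonymous users, pass for identified users" setup described above. The backend address, the `session` cookie name, and the 2-minute TTL are illustrative assumptions, not details taken from the question:

```vcl
# Minimal Varnish 4 VCL sketch (host, port, cookie name, TTL are assumptions)
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";   # Apache+PHP listening behind Varnish
}

sub vcl_recv {
    # Static assets: ignore cookies so they are always cacheable
    if (req.url ~ "\.(css|js|png|jpe?g|gif|ico)$") {
        unset req.http.Cookie;
        return (hash);
    }
    # Identified users (session cookie present) bypass the cache
    if (req.http.Cookie ~ "session") {
        return (pass);
    }
    # Anonymous users: strip cookies so their pages can be cached
    unset req.http.Cookie;
}

sub vcl_backend_response {
    # Keep anonymous PHP pages for a couple of minutes
    if (bereq.url !~ "\.(css|js|png|jpe?g|gif|ico)$") {
        set beresp.ttl = 2m;
    }
}
```

Raising that TTL is exactly the "survive the digg-effect" knob mentioned below: a one-line change turns a mostly-live site into a mostly-cached one.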

    If you do that well enough, and manage to stop re-generating too many pages again and again, maybe you won't even have to optimize any of your code ;-)
    At least, maybe not in any kind of rush... And it's always better to perform optimizations when you are not under too much pressure...


    As a sidenote: you are saying in the OP:

    A site I built with Kohana was slammed with an enormous amount of traffic yesterday,

    This is the kind of sudden situation where a reverse-proxy can literally save the day, if your website can deal with not being up to date by the second:

    • install it, configure it, let it always -- every normal day -- run:
      • Configure it to not keep PHP pages in cache; or only for a short duration; this way, you always have up to date data displayed
    • And, the day you take a slashdot or digg effect:
      • Configure the reverse proxy to keep PHP pages in cache; or for a longer period of time; maybe your pages will not be up to date by the second, but it will allow your website to survive the digg-effect!

    About that, How can I detect and survive being "Slashdotted"? might be an interesting read.


    On the PHP side of things:

    First of all: are you using a recent version of PHP? There are regularly improvements in speed, with new versions ;-)
    For instance, take a look at Benchmark of PHP Branches 3.0 through 5.3-CVS.

    Note that performance is quite a good reason to use PHP 5.3 (I've made some benchmarks (in French), and the results are great)...
    Another pretty good reason being, of course, that PHP 5.2 has reached its end of life, and is not maintained anymore!

    Are you using any opcode cache?

    • I'm thinking about APC - Alternative PHP Cache, for instance (pecl, manual), which is the solution I've seen used the most -- and that is used on all servers on which I've worked.
    • It can really lower the CPU-load of a server a lot, in some cases (I've seen CPU-load on some servers go from 80% to 40%, just by installing APC and activating its opcode-cache functionality!)
    • Basically, execution of a PHP script goes in two steps:
      • Compilation of the PHP source-code to opcodes (kind of an equivalent of JAVA's bytecode)
      • Execution of those opcodes
      • APC keeps those in memory, so there is less work to be done each time a PHP script/file is executed: only fetch the opcodes from RAM, and execute them.
    • You might need to take a look at APC's configuration options, by the way
      • there are quite a few of those, and some can have a great impact on both speed / CPU-load / ease of use for you
      • For instance, disabling [apc.stat](https://php.net/manual/en/apc.configuration.php#ini.apc.stat) can be good for system-load; but it means modifications made to PHP files won't be taken into account unless you flush the whole opcode-cache; about that, for more details, see for instance To stat() Or Not To stat()?
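As a sketch, an APC configuration along those lines could look like this in `php.ini` (the values are illustrative and should be tuned to the application; `apc.stat = 0` is the trade-off just mentioned):

```ini
; Sketch of APC settings -- values are assumptions, tune for your app
extension = apc.so
apc.enabled = 1
apc.shm_size = 64M   ; large enough to hold all opcodes without fragmentation
apc.stat = 0         ; skip the stat() call on each request;
                     ; requires flushing the cache when deploying new code
```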


    Using cache for data

    As much as possible, it is better to avoid doing the same thing over and over again.

    The main thing I'm thinking about is, of course, SQL Queries: many of your pages probably do the same queries, and the results of some of those are probably almost always the same... Which means lots of "useless" queries made to the database, which has to spend time serving the same data over and over again.
    Of course, this is true for other stuff, like Web Services calls, fetching information from other websites, heavy calculations, ...

    It might be very interesting for you to identify:

    • Which queries are run lots of times, always returning the same data
    • Which other (heavy) calculations are done lots of times, always returning the same result

    And store these data/results in some kind of cache, so they are easier to get -- faster -- and you don't have to go to your SQL server for "nothing".

    Great caching mechanisms are, for instance:

    • APC: in addition to the opcode-cache I talked about earlier, it allows you to store data in memory,
    • And/or memcached (see also), which is very useful if you literally have lots of data and/or are using multiple servers, as it is distributed.
    • of course, you can think about files; and probably many other ideas.

    I'm pretty sure your framework comes with some cache-related stuff; you probably already know that, as you said "I will be using the Cache-library more in time to come" in the OP ;-)
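The "compute once, serve from cache" pattern above can be sketched in a few lines of PHP. `cache_get_or_set()` is a hypothetical helper and the array stands in for the real store; in production you would back it with APC (`apc_fetch`/`apc_store`) or the Memcached extension instead:

```php
<?php
// Sketch of "compute once, then serve from cache".
// The array is a stand-in for APC or memcached; cache_get_or_set()
// is a made-up helper name, not a Kohana or PHP built-in.
function cache_get_or_set(array &$cache, string $key, callable $compute)
{
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = $compute();   // only runs on a cache miss
    }
    return $cache[$key];
}

$cache = [];
$calls = 0;

// This closure stands in for a heavy SQL query or webservice call
$expensive = function () use (&$calls) {
    $calls++;
    return ['id' => 1, 'title' => 'Hello'];
};

$first  = cache_get_or_set($cache, 'article:1', $expensive);
$second = cache_get_or_set($cache, 'article:1', $expensive);
// $calls is 1: the second request never touched the "database".
```

With a real backend you would also pass a TTL, so stale entries expire on their own instead of being served forever.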


    Profiling

    Now, a nice thing to do would be to use the Xdebug extension to profile your application: it often allows you to find a couple of weak spots quite easily -- at least, if there is any function that takes lots of time.

    Configured properly, it will generate profiling files that can be analysed with some graphic tools, such as:

    • KCachegrind: my favorite, but works only on Linux/KDE
    • WinCacheGrind for Windows; it does a bit less than KCacheGrind, unfortunately -- it doesn't display callgraphs, typically.
    • Webgrind which runs on a PHP webserver, so works anywhere -- but probably has less features.
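For reference, enabling the profiler is just a few `php.ini` lines. These are the Xdebug 2 option names (current at the time of PHP 5.3; Xdebug 3 later renamed them to `xdebug.mode=profile` and friends), and the output directory is an assumption:

```ini
; Sketch: enable Xdebug's profiler only when explicitly triggered,
; so normal requests don't pay the profiling overhead.
zend_extension = xdebug.so
xdebug.profiler_enable_trigger = 1   ; profile when XDEBUG_PROFILE is passed
xdebug.profiler_output_dir = /tmp
xdebug.profiler_output_name = cachegrind.out.%t.%p
```

Requesting a page with `?XDEBUG_PROFILE=1` then drops a `cachegrind.out.*` file into `/tmp`, which the tools above can open.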

    For instance, here are a couple screenshots of KCacheGrind:


    (source: pascal-martin.fr)

    (source: pascal-martin.fr)

    (BTW, the callgraph presented on the second screenshot is typically something neither WinCacheGrind nor Webgrind can do, if I remember correctly ^^ )


    (Thanks @Mikushi for the comment) Another possibility that I haven't used much is the xhprof extension: it also helps with profiling, can generate callgraphs -- but is lighter than Xdebug, which means you should be able to install it on a production server.

    You should be able to use it alongside XHGui, which will help with the visualisation of the data.


    On the SQL side of things:

    Now that we've spoken a bit about PHP, note that it is more than possible that your bottleneck isn't the PHP-side of things, but the database one...

    At least two or three things, here:

    • You should determine:
      • What are the most frequent queries your application is doing
      • Whether those are optimized (using the right indexes, mainly?), using the EXPLAIN statement, if you are using MySQL
      • whether you could cache some of these queries (see what I said earlier)
    • Is your MySQL well configured? I don't know much about that, but there are some configuration options that might have some impact.
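A typical EXPLAIN check could look like this (the table and columns are made up for illustration):

```sql
-- Sketch: check whether a frequent query uses an index
EXPLAIN SELECT id, title FROM articles WHERE author_id = 42;

-- If the output shows "type: ALL" (a full table scan),
-- adding an index on the filtered column usually fixes it:
ALTER TABLE articles ADD INDEX idx_author (author_id);
```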

    Still, the two most important things are:

    • Don't go to the DB if you don't need to: cache as much as you can!
    • When you have to go to the DB, use efficient queries: use indexes; and profile!


    And what now?

    If you are still reading, what else could be optimized?

    Well, there is still room for improvements... A couple of architecture-oriented ideas might be:

    • Switch to an n-tier architecture:
      • Put MySQL on another server (2-tier: one for PHP; the other for MySQL)
      • Use several PHP servers (and load-balance the users between those)
      • Use another machine for static files, with a lighter webserver, like:
        • lighttpd
        • or nginx -- this one is becoming more and more popular, btw.
      • Use several servers for MySQL, several servers for PHP, and several reverse-proxies in front of those
      • Of course: install memcached daemons on whatever server has any amount of free RAM, and use them to cache as much as you can / makes sense.
    • Use something "more efficient" than Apache?
      • I hear more and more often about nginx, which is supposed to be great when it comes to PHP and high-volume websites; I've never used it myself, but you might find some interesting articles about it on the net;
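The "lighter server for static files" idea above can be sketched as an nginx front-end that serves assets itself and proxies everything else to Apache+PHP. The paths and ports here are assumptions:

```nginx
# Sketch: nginx serves static assets directly, proxies the rest
server {
    listen 80;

    # Static files: served straight from disk, cacheable by browsers
    location ~* \.(css|js|png|jpe?g|gif|ico)$ {
        root /var/www/static;
        expires 7d;
    }

    # Everything dynamic goes to the Apache+PHP backend
    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
    }
}
```

This keeps Apache's limited worker pool for requests that actually need PHP.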

    Well, maybe some of those ideas are a bit overkill in your situation ^^
    But, still... Why not study them a bit, just in case ? ;-)


    And what about Kohana?

    Your initial question was about optimizing an application that uses Kohana... Well, I've posted some ideas that are true for any PHP application... Which means they are true for Kohana too ;-)
    (Even if not specific to it ^^)

    I said: use cache; Kohana seems to support some caching stuff (You talked about it yourself, so nothing new here...)
    If there is anything that can be done quickly, try it ;-)

    I also said you shouldn't do anything that's not necessary; is there anything enabled by default in Kohana that you don't need?
    Browsing the net, it seems there is at least something about XSS filtering; do you need that?

    Still, here's a couple of links that might be useful:


    Conclusion?

    And, to conclude, a simple thought:

    • How much will it cost your company to pay you 5 days? -- considering it is a reasonable amount of time to do some great optimizations
    • How much will it cost your company to buy (pay for?) a second server, and its maintenance?
    • What if you have to scale larger?
      • How much will it cost to spend 10 days? more? optimizing every possible bit of your application?
      • And how much for a couple more servers?

    I'm not saying you shouldn't optimize: you definitely should!
    But go for "quick" optimizations that will get you big rewards first: using some opcode cache might help you get between 10 and 50 percent off your server's CPU-load... And it takes only a couple of minutes to set up ;-) On the other side, spending 3 days for 2 percent...

    Oh, and, btw: before doing anything: put some monitoring stuff in place, so you know what improvements have been made, and how!
    Without monitoring, you will have no idea of the effect of what you did... Not even if it's a real optimization or not!

    For instance, you could use something like RRDtool + cacti.
    And showing your boss some nice graphics with a 40% CPU-load drop is always great ;-)


    Anyway, and to really conclude: have fun!
    (Yes, optimizing is fun!)
    (Ergh, I didn't think I would write that much... Hope at least some parts of this are useful... And I should remember this answer: might be useful some other times...)
