How to get started with web caching, CDNs, and proxy servers?


Question

I'm a newbie programmer building a startup that I (naturally) hope will attract a large amount of traffic. I am hosting my Django project on dotCloud, which runs on Amazon EC2. I have some streaming media (over HTTP though, not RTMP), so the dotCloud guys recommended I go with a CDN. I am also using Amazon S3 for storage, so I decided to go with Amazon CloudFront as my CDN.

The time has come to turn my attention to caching, and I am lost and confused. I am completely new to the concept. The entire extent of my knowledge comes from a tutorial I just read (http://www.mnot.net/cache_docs/) and a confusing weekend spent consulting Google. Most troubling of all is that I am not even sure what I need to do for my site.


  1. What is the difference between a CDN and a proxy server?

  2. Is it possible I might want to use a caching service (e.g. memcached, redis), a CDN (CloudFront), AND a proxy server (squid)?

  3. Our site is DB driven and produces dynamically generated lists specific to user locations. Can such a site be cached? (The lists themselves are filterable via AJAX, so the URL might remain the same while producing largely different results. For instance, example.com/some_url/ might generate a list of 40 objects, with only 10 appearing on the page. By clicking on a filter, the user could end up with 10 different objects while still at /some_url/.)

  4. What are the best practices for a high-traffic, rich-content site?

  5. How can I learn about this? Everywhere I look seems to take for granted some basics that I just don't have as part of my own foundation yet.

I'm not certain I'm asking the right questions. I'm just feeling very lost. I've now built 95% of my site and thought I was just ironing out the details, but caching seems like another major undertaking. Any guidance/advice/encouragement would be much appreciated!

Recommended answer

Right then, let's start with caching...

Caching is about storing something temporarily so that you don't have to perform a more expensive operation every time you need it.

HTTP caching is about saving round trips to servers. With the default behaviour, a browser that already holds a copy of a resource asks the server to "send me a copy of this resource if you have a more recent version".
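
As a rough illustration of that "only if newer" question, here is a minimal Python sketch of how a server might answer it using the Last-Modified / If-Modified-Since header pair. The function name `conditional_get` and its arguments are made up for this example; a real framework such as Django has its own machinery for this.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(resource_mtime, if_modified_since=None):
    """Return (status, headers) for a GET that may carry If-Modified-Since.

    resource_mtime must be a timezone-aware UTC datetime (assumption of
    this sketch).
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    resource_mtime = resource_mtime.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(resource_mtime, usegmt=True)}

    if if_modified_since is not None:
        try:
            client_copy_time = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            client_copy_time = None  # unparseable header: treat as absent
        if client_copy_time is not None:
            if client_copy_time.tzinfo is None:
                # Treat a zone-less date as UTC so the comparison is valid.
                client_copy_time = client_copy_time.replace(tzinfo=timezone.utc)
            if resource_mtime <= client_copy_time:
                # The browser's copy is still current: no body is sent.
                return 304, headers

    # No usable validator, or the resource changed: send the full resource.
    return 200, headers
```

The 304 response still costs a round trip, but it carries no body, which is where the bandwidth saving comes from.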

If you set the Expires header to a future time, the browser doesn't ask this question until that time has passed, because it knows it can use the copy of the resource it already has.
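
A small sketch of what setting that header could look like in Python. The helper name `far_future_headers` is hypothetical, and the `Cache-Control: max-age` line is an addition of mine (it is the HTTP/1.1 way of saying the same thing as Expires), not something the answer above prescribes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def far_future_headers(max_age_seconds, now=None):
    """Build response headers that let a browser reuse its copy for a while."""
    if now is None:
        now = datetime.now(timezone.utc)
    expires_at = now + timedelta(seconds=max_age_seconds)
    return {
        # Absolute expiry time, formatted as an HTTP date in GMT.
        "Expires": format_datetime(expires_at, usegmt=True),
        # Relative lifetime in seconds; caches prefer this when both are set.
        "Cache-Control": f"public, max-age={max_age_seconds}",
    }
```

This suits static files whose URLs change when their content changes; for anything else, a long lifetime means visitors may keep seeing the old version until it expires.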

Caching at this level improves the end user's experience and saves you bandwidth.

From your brief description, HTTP caching could help with the smaller static files (have a read of ch. 3 of bookofspeed.com).

DB caching, which is what memcached (and redis) are used for, is about reducing the load on databases (for example) by saving the results of an operation and then serving them from the cache rather than repeating the database operation.
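
To make that concrete, here is a minimal sketch of the pattern in Python, with a plain in-process dictionary standing in for memcached or redis. The class `TTLCache` and its method names are invented for this illustration and are not the API of either service.

```python
import time


class TTLCache:
    """Tiny in-memory cache: compute a value once, reuse it until it expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the expensive operation is skipped
        value = compute()    # cache miss or expired: do the expensive work
        self._entries[key] = (now + self._ttl, value)
        return value
```

Usage would look something like `cache.get_or_compute("listings:london", run_query)`, where `run_query` is whatever function performs the database operation. The expiry time bounds how stale a served result can be.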

In your situation you would cache at the data retrieval layer based on the request parameters (and perhaps ensure the HTTP responses to the client aren't cached).
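
One way to read "based on the request parameters" is to derive the cache key from the location and the active filters rather than from the URL, since the URL stays the same. The sketch below assumes the filters arrive as a JSON-serialisable dictionary; the function name, the key prefix, and the choice of `no-store` are all illustrative assumptions.

```python
import hashlib
import json


def listing_cache_key(path, params):
    """Build a stable cache key from the path plus the filter parameters."""
    # Sorting the keys means the same filters in a different order
    # still map to the same cache entry.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"listing:{path}:{digest}"


# Header asking browsers and intermediaries not to keep the HTTP response,
# so that only the data-layer cache is in play for these dynamic lists.
NO_HTTP_CACHE_HEADERS = {"Cache-Control": "no-store"}
```

With this, two users in the same location using the same filters share one cached query result, while a different filter at the same /some_url/ gets its own entry.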

CDNs vs Proxy Servers...

These are really different beasts. CDNs are about keeping content close to your visitors and so reducing latency. If you're serving large files, a CDN also puts them on a network optimised for that job instead of on your servers, but there's a £££ price attached to doing that. Some CDNs, e.g. CloudFront, have proxy-like behaviour: they go back to your origin server if they don't have the file the visitor wants.
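
The "go back to the origin if the file is missing" behaviour can be sketched in a few lines of Python. This is a toy model of the idea only, not how CloudFront is implemented; `EdgeCache` and `fetch_from_origin` are hypothetical names.

```python
class EdgeCache:
    """Toy model of one CDN edge location that pulls from an origin on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch_from_origin = fetch_from_origin  # callable: path -> content
        self._store = {}
        self.origin_fetches = 0  # how often the origin server was contacted

    def get(self, path):
        if path not in self._store:
            # First request for this file at this edge: fetch it from the
            # origin server and keep a copy close to the visitors.
            self._store[path] = self._fetch_from_origin(path)
            self.origin_fetches += 1
        return self._store[path]
```

Every request after the first is served from the edge, so the origin sees one fetch per file per edge location rather than one per visitor. Real CDNs also expire and revalidate their copies, which this sketch leaves out.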

Proxy servers are literally servers that sit between your server and the end visitor. They might be part of your server farm (a reverse proxy), the ISP's network, or the visitor's network.

A reverse proxy essentially offloads the work of communicating with the end visitor from your servers; e.g. a visitor on a slow connection would otherwise tie up a server that is generating a page for longer. Reverse proxies can also sit in front of multiple servers, either all doing the same thing or doing different things, with the proxy presenting a single address to the outside world. Squid is one proxy you might use, but Varnish is very popular at the moment too.
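
As a sketch of the "single address in front of multiple servers" idea, here is a round-robin backend picker in Python. Round-robin is just one simple policy chosen for illustration; the class name and the backend addresses are made up, and real proxies such as Squid or Varnish are configured rather than coded like this.

```python
from itertools import cycle


class RoundRobinProxy:
    """Pick a backend for each incoming request, rotating through the list."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._next_backend = cycle(list(backends))

    def route(self, request_path):
        # Visitors only ever see the proxy's address; the proxy decides
        # which internal server actually handles the request.
        backend = next(self._next_backend)
        return f"{backend}{request_path}"
```

For example, a proxy built with two hypothetical backends, `http://10.0.0.1:8000` and `http://10.0.0.2:8000`, would send successive requests to each in turn.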

Normal proxies just act as caches for the visitors who come through them; e.g. a company may have a caching proxy server at its internet gateway, so that the first person visiting an external site retrieves a file and subsequent visitors get it from the proxy. They get a faster experience and the company reduces its bandwidth consumption.

I'm guessing you don't have a high-traffic site at the moment, so your challenge is to understand where to spend your effort, i.e. what needs optimising when.

My first recommendation would be to get some real user monitoring (RUM) in, even if that means building your own using Boomerang.js or Pion. Also look at monitoring tools such as Cacti/Munin/CollectD so you can understand the load on your servers.
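
If you do end up collecting page-load timings yourself, a first pass at making sense of them might look like the Python sketch below. It assumes you already have a list of load times in milliseconds from your own beacons; the function name and the choice of median plus 95th percentile are my suggestions, not part of any of the tools named above.

```python
import statistics


def summarise_load_times(load_times_ms):
    """Summarise real-user page load times as a median and a 95th percentile."""
    if len(load_times_ms) < 2:
        raise ValueError("need at least two measurements")
    # quantiles(n=100) returns the 99 cut points between percentiles;
    # index 94 is the 95th percentile.
    percentiles = statistics.quantiles(load_times_ms, n=100, method="inclusive")
    return {
        "median_ms": statistics.median(load_times_ms),
        "p95_ms": percentiles[94],
    }
```

The median tells you what a typical visitor sees, while the 95th percentile shows how bad things get for the slowest visitors, which an average tends to hide.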

Understanding your users' experience is key to working out where you need to optimise.
