django-celery infrastructure over multiple servers, broker is redis


Question

Currently we have everything set up on a single cloud server, which includes:

  • Database server
  • Apache
  • Celery
  • redis, as a broker for celery and for some other tasks

Now we are thinking of splitting the main components onto separate servers, e.g. a separate database server, separate storage for media files, and web servers behind load balancers. The goal is to stop paying for one heavy server and instead use load balancers to create servers on demand, to reduce cost and improve overall speed.

Celery is the only part I am really confused about. Has anyone used celery on multiple production servers behind load balancers? Any guidance would be appreciated.

Consider one small use case, as it is currently done on a single server (my confusion is how this can be done when we use multiple servers):

  • User uploads a file abc.pptx -> a reference is stored in the database -> the file is stored on the server disk
  • A task (convert document to pdf) is created and goes into the redis (broker) queue
  • celery, which is running on the same server, picks the task from the queue
    • Reads the file and converts it to pdf using software called docsplit
    • Creates a folder on the server disk (which will be used as static content later on) and puts the pdf file, its thumbnail, the plain text and the original file in it

Considering the above use case, how can you set up multiple web servers which can perform the same functionality?

Recommended answer

What will greatly simplify your processing is some shared storage, accessible from all cooperating servers. With such a design, you can distribute the work among more servers without worrying about which server will perform the next processing step.
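To illustrate the idea, here is a minimal sketch of a conversion step that only ever sees storage keys, so it does not matter which server runs it. Everything named here is an assumption made for illustration: `InMemoryStorage` is a stand-in for real shared storage, `convert_document` is a hypothetical task body (in a real deployment it would be registered as a Celery task), and `fake_to_pdf` replaces the actual call to docsplit.

```python
import posixpath
from typing import Callable, Dict


class InMemoryStorage:
    """Hypothetical stand-in for shared storage such as an S3 bucket."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def exists(self, key: str) -> bool:
        return key in self._objects


def convert_document(storage, source_key: str,
                     to_pdf: Callable[[bytes], bytes]) -> str:
    """Task body that works on any worker because it receives only a key.

    The web server stores the upload under `source_key` and enqueues that
    key; no local file path is ever passed between servers.
    """
    original = storage.get(source_key)       # fetch the upload from shared storage
    pdf_bytes = to_pdf(original)             # convert locally on this worker
    base, _ext = posixpath.splitext(source_key)
    pdf_key = base + ".pdf"
    storage.put(pdf_key, pdf_bytes)          # publish the result for all servers
    return pdf_key


def fake_to_pdf(data: bytes) -> bytes:
    """Placeholder converter used only in this sketch (not docsplit)."""
    return b"%PDF-" + data


storage = InMemoryStorage()
storage.put("uploads/42/abc.pptx", b"slides")
pdf_key = convert_document(storage, "uploads/42/abc.pptx", fake_to_pdf)
```

The point of the design is that the queue message carries a storage key rather than a path on one machine's disk, so the web server that accepted the upload and the worker that converts it can be different hosts.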

If you can use some cloud storage, like AWS S3, use that.

If your servers are running at AWS too, you do not pay for traffic within the same region, and transfers are quite fast.

The main advantage is that your data is available from all the servers under the same bucket/key name, so you do not have to care about who is processing which file, as all servers share the storage on S3.
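One way to take advantage of the shared bucket/key names is to derive every key from the document's database ID, so that any server can compute where the outputs live without asking another server. The sketch below is only an illustration: the `documents/<id>/` prefix, the file extensions and the function name `document_keys` are assumptions, not something prescribed by S3 or by the setup in the question.

```python
import posixpath
from typing import Dict


def document_keys(document_id: int, filename: str) -> Dict[str, str]:
    """Return the storage keys for one uploaded document.

    The layout mirrors the per-document folder from the question:
    the original file, the pdf, a thumbnail and the plain text.
    """
    name = posixpath.basename(filename)          # drop any client-side path
    stem, _ext = posixpath.splitext(name)
    prefix = f"documents/{document_id}"          # hypothetical key prefix
    return {
        "original": f"{prefix}/{name}",
        "pdf": f"{prefix}/{stem}.pdf",
        "thumbnail": f"{prefix}/{stem}.png",
        "text": f"{prefix}/{stem}.txt",
    }


keys = document_keys(42, "abc.pptx")
```

With a layout like this, the celery task only needs the document ID and filename as arguments, and a web server can build the link to the pdf from the same two values.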

Note: if you need to get rid of old files, you can even set up a lifecycle policy on the given bucket, e.g. to delete files older than 1 day or 1 week.
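As a rough sketch of such a policy, the helper below builds an S3 lifecycle configuration that expires objects under a prefix after a number of days. The helper name `expiration_config`, the `tmp/` prefix and the bucket name in the comment are made up for illustration; the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts.

```python
from typing import Any, Dict


def expiration_config(prefix: str, days: int) -> Dict[str, Any]:
    """Build a lifecycle configuration that deletes old objects.

    S3 expiration is expressed in whole days, so 1 is the smallest value.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                "Filter": {"Prefix": prefix},   # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


config = expiration_config("tmp/", 7)

# With boto3 installed and credentials configured, this could be applied as
# (bucket name is hypothetical):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-bucket", LifecycleConfiguration=config)
```

Restricting the rule to a prefix keeps it from deleting converted documents that should stay available as static content.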

There are more options:

• Samba
• A central file server
• FTP
• Google Storage (very similar to AWS S3)
• Swift (from OpenStack)

For small files you could even use Redis, but such solutions are, for good reasons, rather rare.
