Understanding back-end file seeding to provide fast client downloads

Problem description

The theme of my project is to implement a distributed server that offers several files for download to several clients. The server hosts the files, and we want it to implement the best available algorithms so that clients can download data from it quickly.

My idea for the implementation:

Just as a client typically downloads a file using a download manager, there must be some server-side manager/code/algorithm that uploads/seeds the file quickly so that the client can download it. The client must not have to do anything except select the file to be downloaded!

How should I write the code for such a server on the back end, analogous to the multi-threaded download managers that clients use on the front end?

How should the server seed/make available the file to the client if the client only sends the path as a String to the server in Java for downloading?

Or, if I am missing something/my idea is totally wrong, please enlighten me with an alternative process/algorithm which I must implement on the server side. Please remember that the whole purpose of asking this question is the back end server seeding algorithm OR equivalent algorithms/methods.

Solution

I assume this server of yours has a good internet connection with a broad upstream. If that is the case, then the limiting factor when only a few clients are downloading a few files is the bandwidth of those clients, so downloads will be at most as fast as your clients' downstream bandwidth. Simply taking an off-the-shelf HTTP server library to serve the downloads should therefore be sufficient.
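To make the "off-the-shelf HTTP server" idea concrete, here is a minimal sketch using the HTTP server that ships with the JDK (com.sun.net.httpserver). It also touches on the question of a client sending only a path string: the path is resolved against a root directory and rejected if it escapes that directory. The class name SimpleDownloadServer, the root directory "files" and port 8080 are assumptions made for illustration, not anything from the question.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SimpleDownloadServer {

    /** Maps a client-supplied path string onto a file below root; returns null if it escapes root. */
    static Path resolve(Path root, String requestPath) {
        Path base = root.toAbsolutePath().normalize();
        String relative = requestPath.replaceFirst("^/+", ""); // drop leading slashes
        Path candidate = base.resolve(relative).normalize();   // collapses any ".." segments
        return candidate.startsWith(base) ? candidate : null;
    }

    /** Starts the server; port 0 lets the OS pick a free port. */
    static HttpServer start(Path root, int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> handle(root, exchange));
        server.start();
        return server;
    }

    private static void handle(Path root, HttpExchange exchange) throws IOException {
        try {
            Path file = resolve(root, exchange.getRequestURI().getPath());
            if (file == null || !Files.isRegularFile(file)) {
                exchange.sendResponseHeaders(404, -1); // -1 means "no response body"
                return;
            }
            exchange.sendResponseHeaders(200, Files.size(file));
            try (OutputStream out = exchange.getResponseBody()) {
                Files.copy(file, out); // streams the file to the client
            }
        } finally {
            exchange.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: downloadable files live in ./files, served on port 8080.
        start(Paths.get("files"), 8080);
    }
}
```

With this running, a client would request something like http://localhost:8080/some/file.zip and needs no logic beyond choosing the file. For a real deployment a mature HTTP server or library would likely be a better fit than this sketch.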

Where your backend implementation really matters and is able to improve download performance is when many users are connecting to your server and downloading many files. First off, there are the following points to consider:

  • TCP has a startup time. When you first open a connection, the download rate starts low and increases until it hits the maximum (TCP slow start). To minimize this time when downloading multiple files, the connection opened for one file download should be reused for the next file.

  • Downloading many files at once (on the client side) is not reasonable when bandwidth is the limiting factor, because the client has to start up many TCP connections, and the data will either be fragmented when written to disk or (when space is allocated beforehand) keep the disk busy jumping between sectors.

  • Your server should generally use a non-blocking IO library (e.g. java.nio) and refrain from creating a thread per incoming connection, since this leads to thrashing, which in turn decreases your server's performance drastically.
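As a rough illustration of the last point, the sketch below serves every connection from a single thread using a java.nio Selector, and pushes file data with FileChannel.transferTo so the bytes do not have to pass through a user-space buffer. It is deliberately simplified: each client that connects receives one fixed file and is then disconnected, with no request parsing or connection reuse. The class name NioFileSender and this one-file "protocol" are assumptions for the example only.

```java
import java.io.Closeable;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.FileChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Iterator;

class NioFileSender implements Runnable, AutoCloseable {

    /** Per-connection progress: the open file and how many bytes were already sent. */
    private static final class Transfer {
        final FileChannel file;
        long position;
        Transfer(FileChannel file) { this.file = file; }
    }

    private final Path file;
    private final Selector selector;
    private final ServerSocketChannel server;

    NioFileSender(Path file, int port) throws IOException {
        this.file = file;
        this.selector = Selector.open();
        this.server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(port)); // port 0 = let the OS choose
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    /** Single-threaded event loop: one thread multiplexes all connections. */
    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until some channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) accept();
                    else if (key.isWritable()) write(key);
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector was closed or the listening socket failed: leave the loop.
        }
    }

    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client == null) return; // nothing pending after all
        try {
            client.configureBlocking(false);
            FileChannel in = FileChannel.open(file, StandardOpenOption.READ);
            client.register(selector, SelectionKey.OP_WRITE, new Transfer(in));
        } catch (IOException e) {
            closeQuietly(client); // e.g. file missing: drop this client, keep serving others
        }
    }

    private void write(SelectionKey key) {
        SocketChannel client = (SocketChannel) key.channel();
        Transfer t = (Transfer) key.attachment();
        try {
            long remaining = t.file.size() - t.position;
            t.position += t.file.transferTo(t.position, remaining, client);
            if (t.position < t.file.size()) return; // socket buffer full, wait for next OP_WRITE
        } catch (IOException e) {
            // Client went away mid-transfer: fall through to cleanup.
        }
        key.cancel();
        closeQuietly(t.file);
        closeQuietly(client);
    }

    private static void closeQuietly(Closeable c) {
        try { c.close(); } catch (IOException ignored) { }
    }

    /** Stops the event loop and the listening socket; in-flight client sockets are not closed here. */
    @Override
    public void close() throws IOException {
        selector.close();
        server.close();
    }
}
```

Running `new Thread(new NioFileSender(path, 9000)).start()` would be enough to try it out (9000 being an arbitrary port). The key design choice is that a slow client never blocks a thread: when its socket buffer is full, the loop simply moves on and resumes that transfer when the selector reports the socket writable again.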

If you have a really large number of clients downloading from your server simultaneously, the limit you will probably hit will be either:

  • The upstream limit of your provider

  • The read speed of your hard drive (SSDs reach ~500 MB/s as far as I'm informed)
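A back-of-the-envelope way to see which limit bites first: assume the clients share the slower of the two server-side limits equally, and that no client can receive faster than its own downstream. The helper below is only a sketch of that arithmetic; the equal-sharing assumption and the name BottleneckEstimate are mine.

```java
class BottleneckEstimate {

    /**
     * Best-case per-client rate in MB/s, assuming all clients share the slower of
     * the server's upstream and storage read speed equally, capped by the client's
     * own downstream bandwidth.
     */
    static double perClientMBps(double upstreamMBps, double storageReadMBps,
                                int clients, double clientDownstreamMBps) {
        double serverShare = Math.min(upstreamMBps, storageReadMBps) / clients;
        return Math.min(serverShare, clientDownstreamMBps);
    }
}
```

For example, with an assumed 125 MB/s upstream (1 Gbit/s), the ~500 MB/s SSD figure from above, and clients that can each receive 10 MB/s: 5 clients are limited by their own 10 MB/s, whereas 100 clients get at most 1.25 MB/s each, and the upstream rather than the disk is the bottleneck.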

Your server can try to hold the most commonly requested files in its memory and serve the content from there (DDR3 RAM, http://en.wikipedia.org/wiki/DDR3_SDRAM, reaches speeds of 17 GB/s). I doubt that you have so few files on your server that you could cache them all in your server's RAM.

So the main engineering task lies in cleverly selecting which content should be cached and which should not. This could be done on a priority basis by assigning higher priorities to certain files, or by a metric that encodes the probability of a single file being downloaded in the next few minutes, or simply by caching the files that are being downloaded by the most clients at this point in time.
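The simplest of these policies (cache whatever is requested most right now) could be sketched as follows. Given file sizes, recent request counts and a RAM budget, it greedily picks the most requested files that still fit. The greedy strategy, the class name CacheSelector and the idea of counting requests over some recent window are assumptions for illustration; a priority- or probability-based metric could be plugged in by replacing the request count.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CacheSelector {

    /**
     * Chooses which files to keep in RAM: most requested first, skipping any file
     * that no longer fits into the remaining budget or was not requested at all.
     */
    static Set<String> select(Map<String, Long> sizeBytes,
                              Map<String, Integer> recentRequests,
                              long budgetBytes) {
        List<String> names = new ArrayList<>(sizeBytes.keySet());
        // Sort by request count, highest first; ties are broken by name for determinism.
        names.sort((a, b) -> {
            int byCount = Integer.compare(recentRequests.getOrDefault(b, 0),
                                          recentRequests.getOrDefault(a, 0));
            return byCount != 0 ? byCount : a.compareTo(b);
        });

        Set<String> chosen = new LinkedHashSet<>();
        long used = 0;
        for (String name : names) {
            long size = sizeBytes.get(name);
            boolean requested = recentRequests.getOrDefault(name, 0) > 0;
            if (requested && used + size <= budgetBytes) {
                chosen.add(name);
                used += size;
            }
        }
        return chosen;
    }
}
```

The server would rerun this selection periodically (say, every few minutes) and load or evict files accordingly. Note that greedy selection is not optimal in general: one very popular but huge file can crowd out several moderately popular small ones, so weighting by requests per byte may be worth trying.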

With such considerations you can push the limits of your download server up to a certain point, beyond which the only improvement can be achieved by distributing or replicating your files across many servers.

If you are heading in a direction where serving millions of clients simultaneously must be possible, you should consider buying such a service from a CDN. CDNs specialize in fast delivery and have many upstream servers in most ASes (autonomous systems), so that every client can download its files from a regional CDN server.


I know I haven't given a complete algorithm or implementation, but I didn't intend to answer this question completely. I just wanted to give you some important guidelines and thoughts on the topic. I hope you can at least use some of these thoughts for your project.
