How to download faster?

Question

What is the fastest way to download a webpage's source into a memo component? I use the Indy and HttpCli components.

The problem is that I have a listbox filled with more than 100 sites. My program downloads each site's source into a memo and parses that source for mp3 files. It is something like a Google music search program; it uses Google queries to make searching Google easier.

I started reading about threads, which led to my question: can I create an IdHttp instance in a thread, together with the parsing function, and tell it to parse half of the sites in the listbox?

So basically, when the user clicks Parse, the main thread should do:

for i := 0 to listbox1.items.count div 2 - 1 do
    get and parse

and another thread should do:

for i := form1.listbox1.items.count div 2 to form1.listbox1.items.count - 1 do
    get and parse.

so that both would add parsed content to form1.listbox2 at the same time. Or would it be easier to start two IdHttp instances in the main thread, one for the first half of the sites and the other for the second half?

For this, should I use Indy or Synapse?

Recommended answer

I would create a thread that reads a single URL and processes its content. You can then decide how many of those threads to run at the same time. Your computer allows quite a number of connections, so if those 100 sites have different hostnames, running 10 or 20 threads at once is not a problem. Too many is overkill, but too few wastes processor time.
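As a rough illustration of the "one thread per URL" idea, here is a minimal sketch. The unit name `DownloadThread`, the class `TDownloadThread`, and its properties are hypothetical names chosen for this example. It assumes Indy 10's `TIdHTTP`, and it only downloads; the parsing step is left out.

```pascal
unit DownloadThread;

interface

uses
  Classes, SysUtils, IdHTTP;

type
  // Hypothetical worker: downloads one URL and keeps the page source in memory.
  TDownloadThread = class(TThread)
  private
    FUrl: string;
    FSource: string;
    FError: string;
  protected
    procedure Execute; override;
  public
    constructor Create(const AUrl: string);
    property Url: string read FUrl;
    property Source: string read FSource;  // read only after the thread has finished
    property Error: string read FError;    // empty if the download succeeded
  end;

implementation

constructor TDownloadThread.Create(const AUrl: string);
begin
  inherited Create(True);    // create suspended; the caller starts the thread
  FreeOnTerminate := False;  // the caller reads Source and then frees the thread
  FUrl := AUrl;
end;

procedure TDownloadThread.Execute;
var
  Http: TIdHTTP;
begin
  // Each thread owns its own TIdHTTP instance; nothing is shared with the UI.
  Http := TIdHTTP.Create(nil);
  try
    try
      FSource := Http.Get(FUrl);
    except
      on E: Exception do
        FError := E.Message;  // keep the message instead of letting the thread die
    end;
  finally
    Http.Free;
  end;
end;

end.
```

The caller would create perhaps 10 or 20 of these, start each one (`Start` in recent Delphi versions, `Resume` in older ones), call `WaitFor` on each, and then read `Source` or `Error` before freeing the thread.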

You can tweak this process even further by using separate threads for downloading and processing, so that a number of threads are constantly downloading content. Downloading is not very processor intensive; it is mostly waiting for a response. You can therefore easily have a relatively large number of download threads, while a couple of other worker threads grab items from the pool of results and process them.
But splitting downloading and processing makes things a bit more complex, and I don't think you're up to that challenge yet.

Because right now, you have some other problems. First, you should not use VCL components from a thread. If a thread needs information from a listbox, you will either need to use Synchronize in the thread to make a 'safe' call to the main thread, or you will have to pass the needed information to the thread before you start it. The latter is more efficient, because code executed through Synchronize actually runs in the main thread, which makes your multi-threading less effective.
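Both options can be combined, as in the sketch below: the URLs are copied out of the listbox in the constructor, which runs in the main thread, and only the delivery of each result goes through `Synchronize`. The names `TParseThread`, `TResultEvent`, and `DeliverResult` are hypothetical, and the download-and-parse step is replaced by a placeholder that simply passes the URL along.

```pascal
unit ParseThread;

interface

uses
  Classes, SysUtils;

type
  // Hypothetical callback; the form implements it, e.g. to add to ListBox2.
  TResultEvent = procedure(const AItem: string) of object;

  TParseThread = class(TThread)
  private
    FUrls: TStringList;      // private copy, never shared with the UI
    FOnResult: TResultEvent;
    FPending: string;        // value handed over to the main thread
    procedure DeliverResult; // runs in the main thread via Synchronize
  protected
    procedure Execute; override;
  public
    constructor Create(AUrls: TStrings; AOnResult: TResultEvent);
    destructor Destroy; override;
  end;

implementation

constructor TParseThread.Create(AUrls: TStrings; AOnResult: TResultEvent);
begin
  inherited Create(True);    // suspended; the caller starts it
  FUrls := TStringList.Create;
  FUrls.Assign(AUrls);       // copy made in the caller's (main) thread
  FOnResult := AOnResult;
end;

destructor TParseThread.Destroy;
begin
  inherited;                 // waits for Execute to finish first
  FUrls.Free;
end;

procedure TParseThread.DeliverResult;
begin
  if Assigned(FOnResult) then
    FOnResult(FPending);
end;

procedure TParseThread.Execute;
var
  i: Integer;
begin
  for i := 0 to FUrls.Count - 1 do
  begin
    if Terminated then
      Break;
    // Placeholder: download and parse FUrls[i] here, without touching the VCL.
    FPending := FUrls[i];
    Synchronize(DeliverResult);  // only this short call runs in the main thread
  end;
end;

end.
```

In a form, this might be used as `TParseThread.Create(ListBox1.Items, AddResult)`, where `AddResult` is an assumed form method that calls `ListBox2.Items.Add`. Keeping the synchronized part this small is the point: the slow work stays in the thread.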

But my attention was actually drawn to the first line, "download webpage source into memo component". Don't do that! Don't load those results into a memo for processing. Automatic processing is best done in memory, outside of visual controls. Using strings, streams, or even stringlists to process text is much faster than using a memo.
A stringlist has some overhead as well, but it indexes lines in the same way (TMemoStrings, which is the Lines property of a memo, and TStringList share the same ancestor), so if you have code that relies on this, converting it to TStringList will be quite easy.
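To make the in-memory approach concrete, here is a small console sketch that scans a page source held in a plain string and collects mp3 links into a TStringList. The procedure name `ExtractMp3Links` and the sample HTML are made up for this example, and the matching is deliberately naive (it only looks for lowercase `href="..."` attributes), so treat it as a starting point rather than a real HTML parser.

```pascal
program Mp3Links;

{$APPTYPE CONSOLE}

uses
  Classes, SysUtils, StrUtils;

// Collects every href="..." target that ends in .mp3 (case-insensitive).
procedure ExtractMp3Links(const Source: string; Links: TStrings);
const
  Marker = 'href="';
var
  StartPos, EndPos: Integer;
  Target: string;
begin
  StartPos := PosEx(Marker, Source, 1);
  while StartPos > 0 do
  begin
    Inc(StartPos, Length(Marker));           // first character of the target
    EndPos := PosEx('"', Source, StartPos);  // closing quote of the attribute
    if EndPos = 0 then
      Break;                                 // unterminated attribute: stop
    Target := Copy(Source, StartPos, EndPos - StartPos);
    if AnsiEndsText('.mp3', Target) then
      Links.Add(Target);
    StartPos := PosEx(Marker, Source, EndPos + 1);
  end;
end;

var
  Links: TStringList;
  i: Integer;
begin
  Links := TStringList.Create;
  try
    ExtractMp3Links(
      '<a href="http://example.com/a.mp3">a</a> <a href="/about.html">x</a>',
      Links);
    for i := 0 to Links.Count - 1 do
      Writeln(Links[i]);
  finally
    Links.Free;
  end;
end.
```

Because `Links` is declared as `TStrings`, the same procedure would also accept `Memo1.Lines` or `ListBox2.Items`, but passing a TStringList keeps the work away from the visual controls until the results are ready.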
