What does it mean to say a web crawler is I/O bound and not CPU bound?


Question

I've seen this point made in some Stack Overflow answers: the programming language doesn't matter as much for a crawler, so C++ is overkill compared with, say, Python. Can someone please explain this in layman's terms so that there's no ambiguity about what is implied? Clarification of the underlying assumptions would also be appreciated.

Thanks

Recommended answer

It means that I/O is the bottleneck here. Going out to the network to retrieve a page (I/O) is slower than analysing the page (CPU).

So making the CPU part ten times faster will have little effect on the overall time taken. Doubling the I/O speed, on the other hand, will have a very beneficial effect, right up to the point where the CPU starts being the bottleneck.
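To put rough numbers on that argument, here is a minimal sketch. The per-page costs (200 ms to fetch, 2 ms to parse) and the names `FETCH_SECONDS`, `PARSE_SECONDS` and `time_per_page` are assumptions made up for illustration, not measurements from any real crawler, and the model assumes each page is fetched and then parsed one after the other.

```python
# Hypothetical per-page costs, chosen only to illustrate the argument.
FETCH_SECONDS = 0.200   # time spent waiting on the network (I/O)
PARSE_SECONDS = 0.002   # time spent analysing the page (CPU)


def time_per_page(io_speedup=1.0, cpu_speedup=1.0):
    """Seconds per page when the fetch and the parse run back to back."""
    return FETCH_SECONDS / io_speedup + PARSE_SECONDS / cpu_speedup


baseline = time_per_page()
faster_cpu = time_per_page(cpu_speedup=10.0)  # e.g. a much faster language
faster_io = time_per_page(io_speedup=2.0)     # e.g. a faster connection

print(f"baseline:       {baseline * 1000:.1f} ms per page")
print(f"CPU 10x faster: {faster_cpu * 1000:.1f} ms per page")
print(f"I/O 2x faster:  {faster_io * 1000:.1f} ms per page")
```

Under these assumed numbers, a tenfold CPU speedup saves less than 1% of the time per page, while doubling the I/O speed roughly halves it. That is the sense in which the choice between C++ and Python matters little for a crawler: the faster language only shrinks the small term.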
