Which is more efficient? More cores or more CPUs


Problem description

I realize this is more of a hardware question, but it is also very relevant to software, especially when programming for multi-threaded, multi-core/multi-CPU environments.

Which is better, and why? Whether it be regarding efficiency, speed, productivity, usability, etc.

1.) A computer/server with 4 quad-core CPUs?

or

2.) A computer/server with 16 single-core CPUs?

Please assume all other factors (speed, cache, bus speeds, bandwidth, etc.) are equal.

Edit:

I'm interested in performance in general. If one option is notably better in one respect and horrible (or not preferable) in another, I'd like to know that as well.

And if I have to choose, I'm most interested in which is better for I/O-bound applications and for compute-bound applications.

Solution

That's not an easy question to answer. Computer architecture is unsurprisingly rather complicated. Below are some guidelines but even these are simplifications. A lot of this will come down to your application and what constraints you're working within (both business and technical).

CPUs generally have two or three levels of cache on the chip. Some modern CPUs also have a memory controller on the die. That can greatly improve the speed of exchanging data between cores. Memory I/O between CPUs has to go over an external bus, which tends to be slower.

AMD chips use HyperTransport, which is a point-to-point protocol.

Complicating all this, however, is the bus architecture. Intel's Core 2 Duo/Quad systems use a shared bus. Think of this like Ethernet or cable internet, where there is only so much bandwidth to go round and every new participant takes another share from the whole. Core i7 and newer Xeons use QuickPath, which is pretty similar to HyperTransport.
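As a rough illustration of the shared-bus point, here is a toy model in Python. The function name and the 10 GB/s figure are made up for the example and are not measurements of any real chipset, and the even split is itself a simplification.

```python
def per_socket_bandwidth(total_gb_per_s: float, sockets: int, shared_bus: bool) -> float:
    """Toy model of the bandwidth each socket sees under two bus designs.

    shared_bus=True  -> one bus of capacity total_gb_per_s, split evenly.
    shared_bus=False -> each socket keeps its own link of capacity total_gb_per_s.
    All numbers are hypothetical; real interconnects are far more complex.
    """
    if sockets < 1:
        raise ValueError("sockets must be at least 1")
    return total_gb_per_s / sockets if shared_bus else total_gb_per_s


if __name__ == "__main__":
    for n in (1, 2, 4):
        shared = per_socket_bandwidth(10.0, n, shared_bus=True)
        dedicated = per_socket_bandwidth(10.0, n, shared_bus=False)
        print(f"{n} socket(s): shared {shared:.1f} GB/s each, "
              f"point-to-point {dedicated:.1f} GB/s each")
```

In this model the shared figure shrinks as sockets are added while the point-to-point figure stays flat, which is the intuition behind the Ethernet comparison.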

More cores per CPU (rather than more CPUs) will take up less space, use less power and cost less (unless you're using really low-powered CPUs), both in per-core terms and in the cost of other hardware (e.g. motherboards).

Generally speaking, one CPU will be the cheapest (both in terms of hardware AND software). Commodity hardware can be used for this. Once you go to a second socket, you tend to have to use different chipsets, more expensive motherboards and often more expensive RAM (e.g. ECC fully buffered RAM), so you take a massive cost hit going from one CPU to two. It's one reason so many large sites (including Flickr, Google and others) use thousands of commodity servers (although Google's servers are somewhat customized to include things like a 9V battery, the principle is the same).

Your edits don't really change much. "Performance" is a highly subjective concept. Performance at what? Bear in mind, though, that if your application isn't sufficiently multithreaded (or multiprocess) to take advantage of extra cores, then you can actually decrease performance by adding more cores.
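One common way to put numbers on the "sufficiently multithreaded" caveat is Amdahl's law, which is not part of the original answer but fits the point. The sketch below assumes a fixed fraction of the work can run in parallel and ignores synchronization and memory overhead, so it only gives an upper bound; it shows diminishing returns, not the outright slowdown that contention can cause in practice. The function name is hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many cores exist;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (1, 4, 16))
        print(f"parallel fraction {p:.2f} -> {row}")
```

With half the work serial, going from 4 to 16 cores barely moves the bound, which is why core count alone says little without knowing the application.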

I/O bound applications probably won't prefer one over the other. They are, after all, bound by I/O not CPU.
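A small experiment makes the I/O-bound case concrete. This is only a sketch: `time.sleep` stands in for a real disk or network wait, and the helper names are invented for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_request(delay_s: float) -> float:
    """Stand-in for a disk or network call: the thread simply waits."""
    time.sleep(delay_s)
    return delay_s


def run_requests(count: int, delay_s: float) -> float:
    """Run `count` simulated requests concurrently; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=count) as pool:
        # Each worker thread spends its time waiting, not computing.
        list(pool.map(fake_io_request, [delay_s] * count))
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_requests(count=8, delay_s=0.2)
    print(f"8 simulated waits of 0.2 s finished in {elapsed:.2f} s")
```

Because the threads are waiting rather than computing, the waits overlap even on a single core, so the elapsed time tracks the slowest wait rather than the number of cores or sockets available.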

For compute-bound applications, it depends on the nature of the computation. If you're doing lots of floating-point work, you may benefit far more from offloading calculations to a GPU (e.g. using Nvidia CUDA). You can get a huge performance benefit from this. Take a look at the GPU client for Folding@Home for an example.

In short, your question doesn't lend itself to a specific answer because the subject is complicated and there's just not enough information. Technical architecture is something that has to be designed for the specific application.
