Why were bitwise operations slightly faster than addition/subtraction operations on older microprocessors?


Question

I came across this excerpt today:

On most older microprocessors, bitwise operations are slightly faster than addition and subtraction operations and usually significantly faster than multiplication and division operations. On modern architectures, this is not the case: bitwise operations are generally the same speed as addition (though still faster than multiplication).

I'm curious about why bitwise operations were slightly faster than addition/subtraction operations on older microprocessors.

All I can think of that would cause the latency is that the circuits to implement addition/subtraction depend on several levels of logic gates (parallel adders and whatnot), whereas the bitwise operations have far simpler circuit implementations. Is this the reason?

I know arithmetic and bitwise operations both execute within one clock cycle on modern processors, but speaking purely about propagation time for the circuit, is the latency still theoretically there in modern processors?

Finally, I had a conceptual C question about the execution of the bitwise shift operation:

unsigned x = 1;
x <<= 5;

unsigned y = 0;
y += 32;

Both x and y should hold the value 32, but did it take 5 separate left shifts to get x to that value (as in, are bitwise shifts implemented via pipes)? To clarify, I'm asking purely about the circuit behavior, not the number of clock cycles.

Answer

In any binary bitwise operation, each output bit depends only on the two corresponding bits in the inputs. In an add operation, each output bit depends on the corresponding bits in the inputs and all the bits to the right (toward lower values).

For example, the leftmost bit of 01111111 + 00000001 is 1, but the leftmost bit of 01111110 + 00000001 is 0.

In its simplest form, an adder adds the two low bits and produces one output bit and a carry. Then the next two lowest bits are added, and the carry is added in, producing another output bit and another carry. This repeats. So the highest output bit is at the end of a chain of adds. If you do the operation bit by bit, as older processors did, then it takes time to get to the end.

There are ways to speed this up some, by feeding several input bits into more complicated logic arrangements. But that of course requires more area in the chip and more power.

Today's processors have many different units for performing various sorts of work: loads, stores, addition, multiplication, floating-point operations, and more. Given today's capabilities, the work of doing an add is small compared to other tasks, so it fits within a single processor cycle.

Perhaps in theory you could make a processor that did a bitwise operation more quickly than an add. (And there are, at least on paper, exotic processors that operate asynchronously, with different units doing work at their own paces.) However, with the designs in use, you need some regular fixed cycle to coordinate many things in the processor: loading instructions, dispatching them to execution units, sending results from execution units to registers, and much, much more. Some execution units do require multiple cycles to complete their jobs (e.g., some floating-point units take about four cycles to do a floating-point add). So you can have a mix. However, with current scales, making the cycle time smaller so that it fits a bitwise operation but not an add is likely not economical.

