用于添加/求和数字的各个数字分量的最快方法 [英] Fastest method for adding/summing the individual digit components of a number

查看:161
本文介绍了用于添加/求和数字的各个数字分量的最快方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在数学论坛上看到一个问题,一个人正在讨论一遍又一遍地将数字加起来直到达到一位数。 (即362将成为3 + 6 + 2,它将变为11......然后11将变为1 + 1将变为2,因此362将返回2 ...我写了一些很好的代码来得到答案并发布它只是被一个用户提出,建议模9中的任何数字都等于这个无限数字和,我检查了一下他是对的......好吧几乎是正确的,如果返回零,你必须用9将其切换出来,但这是一个非常快速的修复......

I saw a question on a math forum a while back where a person was discussing adding up the digits in a number over and over again until a single digit is achieved. (i.e. "362" would become "3+6+2" which would become "11"... then "11" would become "1+1" would would become "2" therefor "362" would return 2... I wrote some nice code to get an answer to this and posted it only to be outdone by a user who suggested that any number in modulo 9 is equal to this "infinite digit sum", I checked it an he was right... well almost right, if zero was returned you had to switch it out with a "9" but that was a very quick fix...

362 = 3 + 6 + 2 = 11 = 1 + 1 = 2

362 = 3+6+2 = 11 = 1+1 = 2

或...

362%9 = 2

362%9 = 2

Anways,mod9方法非常适合无限添加数字之和,直到你只剩下一个数字......但是只做一次(即362会只返回11)...有人能想到快速算法吗?

Anways, the mod9 method works fantastic for infinitely adding the sum of the digits until you are left with just a single digit... but what about only doing it once (i.e. 362 would just return "11")... Can anyone think of fast algorithms?

推荐答案

有一个很好的方法来总结1位数在二进制中,并使用固定宽度的整数。在每次迭代时,你将一半分开每个数字分成两个值,向下移位一个值,然后添加。第一次迭代,分开其他数字。第二次迭代,数字对,等等。

There's a cool trick for summing the 1 digits in binary, and with a fixed-width integer. At each iteration, you separate out half the digits each into two values, bit-shift one value down, then add. First iteration, separate ever other digit. Second iteration, pairs of digits, and so on.

鉴于27是00011011为8位二进制,过程是......

Given that 27 is 00011011 as 8-bit binary, the process is...

00010001 + 00000101 = 00010110  <-  every other digit step
00010010 + 00000001 = 00010011  <-  pairs of digits
00000011 + 00000001 = 00000100  <-  quads, giving final result 4

你可以用十进制做类似的技巧,但它的效率低于简单的循环,除非您直接表示十进制数,快速操作将所选数字归零并进行数字移位。因此,对于12345678,您得到...

You could do a similar trick with decimal, but it would be less efficient than a simple loop unless you had a direct representation of decimal numbers with fast operations to zero out selected digits and to do digit-shifting. So for 12345678 you get...

02040608 + 01030507 = 03071115  <-  every other digit
00070015 + 00030011 = 00100026  <-  pairs
00000026 + 00000010 = 00000036  <-  quads, final result

所以1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 = 36,这是正确的,但如果您的数字表示是固定宽度的十进制,则只能有效地执行此操作。它总是需要lg(n)次迭代,其中lg表示基数为2的对数,然后向上舍入。

So 1+2+3+4+5+6+7+8 = 36, which is correct, but you can only do this efficiently if your number representation is fixed-width decimal. It always takes lg(n) iterations, where lg means the base two logarithm, and you round upwards.

稍微扩展一下(根据评论中的讨论) ),让我们假装这是理智的,有点...

To expand on this a little (based on in-comments discussions), let's pretend this was sane, for a bit...

如果计算一位数的加法,实际上更多比这里的简单循环工作。与计数位的按位技巧一样,这个想法是重新排序这些加法(使用关联性),然后并行地计算尽可能多的加法,使用单个全宽加法来实现两个半宽加法,四个数字清除和数字移位操作会产生很大的开销,如果将其实现为循环(计算或查找每个步骤的数字屏蔽和移位距离值),则会产生更多的开销。 循环应该可以完全展开,并且这些掩码和移位距离作为常量包含在代码中以避免这种情况。

If you count single-digit additions, there's actually more work than a simple loop here. The idea, as with the bitwise trick for counting bits, is to re-order those additions (using associativity) and then to compute as many as possible in parallel, using a single full-width addition to implement two half-width additions, four quarter-width additions etc. There's significant overhead for the digit-clearing and digit-shifting operations, and even more if you implement this as a loop (calculating or looking up the digit-masking and shift-distance values for each step). The "loop" should probably be fully unrolled and those masks and shift-distances be included as constants in the code to avoid that.

支持二进制编码的十进制(BCD)可以处理这个问题。数字屏蔽和数字移位将使用位屏蔽和位移来实现,因为每个十进制数字将以4(或更多)位编码,与其他数字的编码无关。

A processor with support for Binary Coded Decimal (BCD) could handle this. Digit masking and digit shifting would be implemented using bit masking and bit shifting, as each decimal digit would be encoded in 4 (or more) bits, independent of the encoding of other digits.

有一个问题是,如今BCD支持非常少见。它曾经在8位和16位天相当普遍,但据我所知,现在仍支持它的处理器主要是为了向后兼容。原因包括......

One issue is that BCD support is quite rare these days. It used to be fairly common in the 8 bit and 16 bit days, but as far as I'm aware, processors that still support it now do so mainly for backward compatibility. Reasons include...


  1. 非常早期的处理器不包括硬件乘法和除法。对这些操作的硬件支持意味着现在将二进制转换为十进制更容易,更有效。二进制现在几乎用于所有事情,并且BCD几乎被遗忘。

  1. Very early processors didn't include hardware multiplication and division. Hardware support for these operations means it's easier and more efficient to convert binary to decimal now. Binary is used for almost everything now, and BCD is mostly forgotten.

图书馆周围有十进制数字表示,但是如果提供任何高级语言则很少对硬件BCD的便携支持,因为汇编程序停止成为大多数开发人员的真实选择BCD支持只是停止使用。

There are decimal number representations around in libraries, but few if any high level languages ever provided portable support to hardware BCD, so since assembler stopped being a real-world option for most developers BCD support simply stopped being used.

随着数字变大,甚至打包的BCD包装效率很低。数字表示基数10 ^ x具有基数10的最重要属性,并且很容易解码为十进制。基本1000只需要每三位数10位,而不是12位,因为2 ^ 10是1024.这足以显示你得到32位的额外十进制数字 - 9位而不是8位 - 你还剩下2位,例如对于一个符号位。

As numbers get larger, even packed BCD is quite inefficiently packed. Number representations base 10^x have the most important properties of base 10, and are easily decoded as decimal. Base 1000 only needs 10 bits per three digits, not 12, because 2^10 is 1024. That's enough to show you get an extra decimal digit for 32 bits - 9 digits instead of 8 - and you've still got 2 bits left over, e.g. for a sign bit.

事实上,对于这个数字总计算法来说,你需要它是值得的使用固定宽度十进制可能至少32位(8位)。这为(完全展开的)简单循环提供了12次操作(6次掩码,3次移位,3次添加)而不是15次添加。然而,这是一个临界收益 - 代码中的其他问题很容易意味着它实际上更慢。

The thing is, for this digit-totalling algorithm to be worthwhile at all, you need to be working with fixed-width decimal of probably at least 32 bits (8 digits). That gives 12 operations (6 masks, 3 shifts, 3 additions) rather than 15 additions for the (fully unrolled) simple loop. That's a borderline gain, though - and other issues in the code could easily mean it's actually slower.

效率增益更清晰,为64位(16位十进制数字),因为仍然只有16个操作(8个掩码,4个移位,4个添加)而不是31个,但找到支持64位BCD操作的处理器的几率似乎很小。即使你这样做了,你还需要多久一次?它似乎不太可能值得努力和失去便携性。

The efficiency gain is clearer at 64 bits (16 decimal digits) as there's still only 16 operations (8 masks, 4 shifts, 4 additions) rather than 31, but the odds of finding a processor that supports 64-bit BCD operations seems slim. And even if you did, how often do you need this anyway? It seems unlikely that it could be worth the effort and loss of portability.

这篇关于用于添加/求和数字的各个数字分量的最快方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆