哪个是实现此算法的更好方法? [英] Which is a better way to implement this algorithm?

查看:78
本文介绍了哪个是实现此算法的更好方法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

嗨。

(因为我已经有了与C ++语言和一般编程相关的
问题,所以对comp.lang.c ++和comp.programming都进行了测试)


我有以下C ++代码。第一个例程在第二个例程的时间内以65%的

运行。但两者都做同样的事情。但是,

第二个在代码编写方式上看起来更好

因为它有助于在内循环中更好地封装转换

至少在我看来,它更容易阅读。也许它不是,

也许前者是更好的我应该这样做,但如果

后者是更好的话。在这方面,我应该放弃它无论如何

和性能权衡,因为我希望这个东西快点???


每个例程的作用是多少两个任意精度的整数

在一起。第二个虽然使用了另外的切片。类型

提供了一个窗口,可以实现乘法和加法。操作是在有限的数字范围内执行,然后可以在数字上进行
,明确这部分算法。


我正在使用简单的小学乘法算法。注意

第二个例程如何更容易概述这个算法,而在第一个

中它更难以看到。你更喜欢哪一种?


为了简洁起见,省略了类别定义和其他成员。

但是要弄清楚它们应该不会太难必要的

信息。


此外,我可以在此代码中丢失类型转换器还是需要它们?


我之所以问的原因是因为我记得早些时候被某人(Alf P. Steinbach,小组:comp.lang.c ++)告诉我们

我的

我的程序的最后一次实现(这是一个分形生成器)

有一些坏的,引起错误的抽象差距。然而我无法获得足够的详细回复,所以我或多或少地试图找出如何让它变得更好(尽管

他确实给了我一些*示例*哪里有问题,我已经修复了b $ b。)。但似乎我失去了表现,而且对于我的应用来说,这不是好事。更不用说我原来的

bignum实现被称为愚蠢。同样(......虽然我没有看到愚蠢的bignum类,但不管它的错是什么......ref:
http://groups.google.com/group /comp....9?dmode=source

没有任何关于它究竟是多么愚蠢的解释。所以我

不得不开始黑暗拍摄那里试图找出我所做的事情

做错了。


第一版(最快):

---

void RawDigit :: Mul(const RawDigit& a,const RawDigit& b)

{

/ *如果(& a == this)
$ b $,请检查我们是否正在进行就地乘法* /

b {

MulInPlace(b);

返回;

}否则if(& b == this){

MulInPlace(a);

返回;

}


/ *获取长度。 * /

std :: size_t rLength(GetLength());

std :: size_t aLength(a.GetLength());

std :: size_t bLength(b.GetLength());


/ *将此归零。 * /

归零();


/ *进行乘法运算。 * /

TwoDigit64携带(0);

for(std :: size_t i(0); i< aLength&& i< rLength; i ++)

{

carry = 0;


for(std :: size_t j(0); j< bLength&&( i + j)< rLength; j ++)

{

TwoDigit64

tmp((static_cast< TwoDigit64>(a.digits [i] )* b.digits [j])+

位数[i + j] +随身携带);

carry = tmp> DIGIT_BITS;

tmp - = carry<< DIGIT_BITS;

位数[i + j] = tmp;

}


if(i + bLength< rLength)

位数[i + bLength] =携带;

}

}

---


第二个版本(较慢):

---

***切片操作。

内联Digit32 MulAddOp( Digit32 a,Digit32 b,Digit32 c,Digit32

& carry)

{

TwoDigit64 sum(a +(static_cast< TwoDigit64>( b)* c)+ carry);

carry = sum> DIGIT_BITS;

return(sum& MAX_DIGIT);

}


inline Digit32 MulAddCarryPropOpL(Digit32 a,Digit32& carry)

{

Digit32 sum(a + carry);

carry = sum<随身携带;

返回(总和);

}


内联数字32 MulAddCarryPropOpR(Digit32 b,Digit32 c,Digit32
& carry)

{

TwoDigit64 sum((static_cast< TwoDigit64>(b)* c)+ carry);

carry = sum> DIGIT_BITS;

return(sum& MAX_DIGIT);

}


Digit32 RawDigitSlice :: MulAdd(const ConstRawDigitSlice& a,

const Digit32 b,

const ConstRawDigitSlice& c)

{

/ *设置迭代器* /

DigitIterator32 ri(GetStartIterator());

ConstDigitIterator32 ai(a.GetConstStartIterator());

ConstDigitIterator32 ci(c.GetConstStartIterator());


/ *获取a和c的最小长度。 * /

std :: size_t minLength(std :: min(std :: min(a.length,c.length),

length));


/ *组合乘法+加* /

Digit32随身携带(0);

std :: size_t i(0);

for(i; i< minLength; ++ i,++ ri,++ ai,++ ci)

* ri = MulAddOp(* ai,b, * ci,carry);


/ *处理a或b的剩余部分。 * /

int greaterLength(std :: min(std :: max(a.length,c.length),length));

if(a.length> ; = c.length)

{

for(i; i< largerLength; ++ i,++ ri,++ ai)

* ri = MulAddCarryPropOpL(* ai,carry);

} else {

for(i; i< largerLength; ++ i,++ ri,++ ci )

* ri = MulAddCarryPropOpR(b,* ci,carry);

}


/ *完成进位传播* /

if(greaterLength< length)

{

for(i; i< length; ++ i,++ ri)

* ri = MulAddCarryPropOpL(0,携带);

}


/ *完成! * /

返回(随身携带);

}


***下一个例程是在不同的源文件中,一个

实现完整的RawDigit class。

***前者将在另一个文件中实现

" RawDigitSlice" class.

void RawDigit :: Mul(const RawDigit& a,const RawDigit& b)

{

/ *查看如果我们正在进行就地乘法* /

if(& a == this)

{

MulInPlace(b );

返回;

}否则if(& b == this){

MulInPlace(a);

返回;

}


/ *获取长度。 * /

std :: size_t rLength(GetLength());

std :: size_t aLength(a.GetLength());

std :: size_t bLength(b.GetLength());


/ *将此归零。 * /

归零();


/ *设置切片。 * /

RawDigitSlice runningSum(* this,0,aLength + 1); / *说明:

(< RawDigit对象>,< origin digit idx>,<切片中的数字>>)* /

ConstRawDigitSlice as(a) ;


/ *做乘法。 * /

for(std :: size_t i(0); i< bLength&& i< rLength; ++ i,++ runningSum)

acc。 MulAdd(runningSum,b [i],as);

}

---

解决方案
" mike3" < mi ****** @ yahoo.com在留言中写道

新闻:0a ************************ ********** @ q24g2000 prf.googlegroups.com ...

:嗨。

:(发布到comp.lang.c ++和comp.programming,因为我有

:与C ++语言和一般编程相关的问题)



:我已经得到以下C ++代码。第一个例程运行在65%的

:第二个例程的时间。但两者都做同样的事情。然而,

:第二个似乎在编写代码的方式上更好

:因为它有助于在内循环中封装转换更好

:至少在我看来,让它更容易阅读。也许它不是,

:也许前者是更好的我应该这样做,但如果

:后者是更好的在这方面,我应该放弃它无论如何

:和性能权衡,因为我希望这个东西很快???

作为这两个实现的首次读者,我发现第一个比第二个更容易理解和维护的
。添加

级别的抽象而没有明显的好处只会混淆代码。


:我正在使用简单的小学。乘法算法。注意

:第二个例程如何更容易概述这个算法,而在第一个例子中,b $ b:它有点难以看清。你更喜欢哪一个?

第一个。


:另外,我可以在这段代码中丢失类型转换器还是我需要它们?

我会评论&回顾下面的第一个函数。

在某些时候,你需要(至少含蓄地)将一个操作数转换为更大的类型,因为这可能不会发生。 />
自动为C ++中的某些类型组合。


:第一个版本(最快):

:---

:void RawDigit :: Mul(const RawDigit& a,const RawDigit& b)

:{

:/ *检查我们是否正在进行就地倍增* /

:if(& a == this)

:{

:MulInPlace( b);

:返回;

:}否则if(& b == this){

:MulInPlace(a);

:返回;

:}

一般班级设计:除非有确定的理由

不这样做,我会使用运算符重载并实现

(作为Num类的成员):

静态Num运算符*(const Num& a,const Num& b);

Num& operator *(const Num& a);

也许RawDigit :: Mul是一个内部私人会员,并且提供了

以上的操作?但是特殊情况

(例如乘法到位)最好在上面的

层中处理...


:/ *获取长度。 * /

:std :: size_t rLength(GetLength());

:std :: size_t aLength(a.GetLength());

:std :: size_t bLength(b.GetLength());



:/ *将此归零。 * /

:归零();


意图真正实现模数学,并允许

结果为被截断?如果不是,结果应该是

自动调整大小,代码也简化了。


:/ *做乘法。 * /

:TwoDigit64随身携带(0);

从技术上讲,随身携带应该只需要1位数,

并且可以在外部定义循环如下。


:for(std :: size_t i(0); i< aLength&& i< rLength; i ++)

:{

:carry = 0;



:for(std :: size_t j(0); j< bLength&&( i + j)< rLength; j ++)

:{

:TwoDigit64

:tmp((static_cast< TwoDigit64>(a.digits) [i])* b.digits [j])+

:digits [i + j] + carry);

:carry = tmp> DIGIT_BITS;

:tmp - = carry<< DIGIT_BITS;

:digits [i + j] = tmp;

:}



:if(if) i + bLength< rLength)

:digits [i + bLength] = carry;

:}

:}


假设您的班级中有以下与数字相关的

定义:

typedef ...数字;

typedef ... TwoDigit;

const unsigned digitBitCount = ...;

const数字digitBitMask = 0xF ... UL;


假设没有截断(由于足够的预分配

的结果数字),相同的算法可以写成:


for(unsigned aPos = 0; aPos< aLength; ++ aPos)

{

数字进位= 0;

TwoDigit const aDig = a .digits [APOS]; // NB:施放隐藏这里

for(unsigned bPos = 0; bPos< bLength; ++ bPos)

{

TwoDigit mul = aDig * b.digits [ bPos]

+ this-> digits [aPos + bPos]

+ carry;


this-> digits [ aPos + bPos] = mul& digitBitMask;

carry =(mul> digitBitCount);

}

this-> digits [aPos + bPos] = carry;

}


有很多正确的方法来写这个,有些是

可能比这个例子更好。但是我希望你能发现它很有用。

无论如何,考虑到这个循环非常可接受的复杂性,

我不会打扰它或添加

封装层。


问候 - 伊万

-
<一个rel =nofollowhref =http://ivan.vecerina.com/contact/?subject=NG_POST\"target =_ blank> http://ivan.vecerina.com/contact/?subject=NG_POST < - 电子邮件联系表格

Brainbench MVP for C ++< http:// www.brainbench.com


4月20日,08:26,mike3< mike4 ... @ yahoo.comwrote:


嗨。

(对于comp.lang.c ++和comp.programming,我已经有了

与C ++语言和通用编程相关的问题)


我有以下C ++代码。第一个例程在第二个例程的时间内以65%的

运行。但两者都做同样的事情。但是,

第二个在代码编写方式上看起来更好

因为它有助于在内循环中更好地封装转换

至少在我看来,它更容易阅读。也许它不是,

也许前者是更好的我应该这样做,但如果

后者是更好的话。在这方面,我应该放弃它无论如何

和性能权衡,因为我希望这个东西快点???


每个例程的作用是多少两个任意精度的整数

在一起。第二个虽然使用了另外的切片。类型

提供了一个窗口,可以实现乘法和加法。操作是在有限的数字范围内执行,然后可以在数字上进行
,明确这部分算法。


我正在使用简单的小学乘法算法。注意

第二个例程如何更容易概述这个算法,而在第一个

中它更难以看到。你更喜欢哪一种?


为了简洁起见,省略了类别定义和其他成员。

但是要弄清楚它们应该不会太难必要的

信息。


此外,我可以在此代码中丢失类型转换器还是需要它们?


我之所以问的原因是因为我记得早些时候被某人(Alf P. Steinbach,小组:comp.lang.c ++)告诉我们

我的

我的程序的最后一次实现(这是一个分形生成器)

有一些坏的,引起错误的抽象差距。然而我无法获得足够的详细回复,所以我或多或少地试图找出如何让它变得更好(尽管

他确实给了我一些*示例*哪里有问题,我已经修复了b $ b。)。但似乎我失去了表现,而且对于我的应用来说,这不是好事。更不用说我原来的

bignum实现被称为愚蠢。同样(......虽然我没有看到愚蠢的bignum类,不管它的错是什么......ref: http://groups.google.com/group/comp .... 38d0b8dab9?dmo ...)

没有任何解释,究竟是什么让它如此愚蠢。所以我

不得不开始黑暗拍摄那里试图找出我所做的事情

做错了。


第一版(最快):

---

void RawDigit :: Mul(const RawDigit& a,const RawDigit& b)

{

/ *如果(& a == this)
$ b $,请检查我们是否正在进行就地乘法* /

b {

MulInPlace(b);

返回;

}否则if(& b == this){

MulInPlace(a);

返回;

}


/ *获取长度。 * /

std :: size_t rLength(GetLength());

std :: size_t aLength(a.GetLength());

std :: size_t bLength(b.GetLength());


/ *将此归零。 * /

归零();


/ *进行乘法运算。 * /

TwoDigit64携带(0);

for(std :: size_t i(0); i< aLength&& i< rLength; i ++)

{

carry = 0;


for(std :: size_t j(0); j< bLength&&( i + j)< rLength; j ++)

{

TwoDigit64

tmp((static_cast< TwoDigit64>(a.digits [i] )* b.digits [j])+

位数[i + j] +随身携带);

carry = tmp> DIGIT_BITS;

tmp - = carry<< DIGIT_BITS;

位数[i + j] = tmp;

}


if(i + bLength< rLength)

位数[i + bLength] =携带;

}}



[snip第二版(慢)]


我认为第一个版本更容易阅读。我更喜欢在一页上看到

主算法而不是数百万小的

方法。我缺少的是:

- 标志。你的所有价值似乎都是无符号的。

你想使用两个'的补码表示或

符号+幅度。

- The管理规模。函数

的用户不应该为调整大整数而烦恼。


您是否知道已存在大整数库。一些

的人已经承担了为大整数

函数编写库的负担。例如:我。

如果您下载Seed7

http://sourceforge.net/project/showf...roup_id=151126

您将看到文件seed7 / src / big_rtl.c包含一个用C编写的bigInteger

库。这个库自动管理

数字的内存并包含通常的算术运算

(包括两种形式的bigInteger除法和余数,其中
截断为零或负无限)。乘法在可能的情况下使用了Karatsuba算法



BTW。:我计划今天做一个改进版本(巧合我

做了好几个bigInteger的改进)。如果你等待大约

12小时你可以获得新版本。


问候Thomas Mertes


Seed7主页: http://seed7.sourceforge.net

Seed7 - 可扩展的编程语言:用户定义的语句

和运算符,抽象数据类型,没有特殊的模板
语法,带接口的OO和多个调度,静态类型,

解释或编译,可移植,在linux / unix / windows下运行。


4月20日,1:43 * am,Ivan Vecerina。

< _INVALID_use_webfo ... @ ivan.vecerina.comwrote:


" mike3" < mike4 ... @ yahoo.com写了留言


新闻:0a *********************** *********** @ q24g2000 prf.googlegroups.com ...

:嗨。

:( Xposted to comp.lang.c ++和comp.programming,因为我有

:与C ++语言和一般编程有关的问题)



:我''我有以下C ++代码。第一个例程运行在65%的

:第二个例程的时间。但两者都做同样的事情。然而,

:第二个似乎在编写代码的方式上更好

:因为它有助于在内循环中封装转换更好

:至少在我看来,让它更容易阅读。也许它不是,

:也许前者是更好的我应该这样做,但如果

:后者是更好的在这方面,我应该放弃它无论如何

:和性能权衡,因为我希望这个东西很快???

作为这两个实现的首次读者,我发现第一个比第二个更容易理解和维护的
。 *添加

级别的抽象而没有明显的好处只会混淆代码。



好​​的,那么我会去第一个。特别是考虑到它更快......


:我正在使用简单的小学。乘法算法。注意

:第二个例程如何更容易概述这个算法,而在第一个例子中,b $ b:它有点难以看清。你更喜欢哪一个?

第一个。



嗯。


:另外,我可以在这段代码中丢失类型转换器吗?或者我需要它们吗?

我会评论&回顾下面的第一个函数。

在某些时候,你需要(至少含蓄地)将一个操作数转换为更大的类型,因为这可能不会发生。 />
自动为某些类型的组合,用C ++。



那么在这种情况下,演员阵容没问题,对吧?


:第一个版本(最快):

:---

:void RawDigit :: Mul(const RawDigit& a, const RawDigit& b)

:{

:* * / *检查我们是否正在进行就地乘法* /

:* * if(& a == this)

:* * {

:* * * MulInPlace(b);

:* * *返回;

:* *}否则if(& b == this){

:* * * MulInPlace(a);

:* * *返回;

:* *}

一般课程设计:除非有确定的理由

不这样做,我会使用运算符重载并实现

(作为Num类的成员):

* s tatic * Num * operator *(const Num& a,const Num& b);

* * * * * Num& * operator *(const Num& a);

也许RawDigit :: Mul是一个内部私人会员,并且提供了

以上的操作? *但是特殊情况

(例如乘法到位)最好在上面的

层中处理......



我已经尝试了重叠的运算符,但似乎必须构建

a临时副本以保存结果,以便在二元

运算符的情况下返回。这对我的应用程序来说是个大问题。因为我有表现

关注。数十亿次释放和重新分配内存

是对性能的严重浪费。 (这些例程将被称为

数十亿次。)


因此,我会做相反的事情并首先构建这些低级例程,

然后使用它们构建重载运算符,然后可以在程序的所有非性能关键区域中使用
。如果它变成

,我不需要将MulInPlace()功能与Mul()集成

,我会删除它并记录下来Mul()没有

就地工作。这是可接受的做法吗?


:* * / *获取长度。 * /

:* * std :: size_t rLength(GetLength());

:* * std :: size_t aLength(a.GetLength());

:* * std :: size_t bLength(b.GetLength());



:* * / *将此归零。 * /

:* *归零();


目的是真正实现模数学,并允许

结果被截断?如果不是,结果应该是自动调整大小,并且代码被简化。



所以我可以省略这个功能并且只记录

例程不像大多数人预期的那样表现吗? (如果我有操作数

有3种不同的长度我通常会期望Mul()尊重这些长度但是你的意见可能会有所不同。) />

此外上面只是清除结果缓冲区,

你需要添加(如算术添加)

数字给它。你需要使用乘法算法才能工作。


你需要得到输入操作数的长度因为

你要循环数字,不?你不想在缓冲区外读取
或读取太少的数字?我想你会的,

但是又一次也许我错了......


:* * / *做乘法。 * /

:* * TwoDigit64随身携带(0);

从技术上讲,随身携带应该只需要1位数,

并且可以在下面的外部循环。



但是我认为这会造成一次性能浪费的转换

一次又一次地从32/64转换。


:* * for(std :: size_t i(0); i< aLength&& i< rLength; i ++)

:* * {

:* * * carry = 0;



:* * * for(std :: size_t j (0); j< bLength&&(i + j)< rLength; j ++)

:* * * {

:* * * * * TwoDigit64

:tmp((static_cast< TwoDigit64>(a.digits [i])* b.digits [j])+

:* * * * * * * * * * * *位数[i + j] +随身携带);

:* * * * * carry = tmp> DIGIT_BITS;

:* * * * * tmp - = carry<< DIGIT_BITS;

:* * * * * digits [i + j] = tmp;

:* * *}



:* * * if(i + bLength< rLength)

:* * * * digits [i + bLength] = carry;

:* *}

:}


假设您的班级中有以下与数字相关的

定义:

* typedef ... *数字;

* typedef ... * TwoDigit;

* const unsigned digitBitCount = ...;

* const数字digitBitMask = 0xF ... UL;


假设没有截断(因为有足够的预分配

结果数字),相同的算法可以写成:


* for(unsigned aPos = 0; aPos< aLength; ++ aPos)

* {

* *数字进位= 0;

* * TwoDigit const aDig = a.digits [aPos]; // NB:施放隐藏这里

* *表示(无符号bPos = 0; bPos< bLength; ++ bPos)

* * {

* * * * TwoDigit mul = aDig * b.digits [bPos]

* * * * * * * * * * * + this-> digits [aPos + bPos]

* * * * * * * * * * * +随身携带;


* * * * this-> digits [aPos + bPos] = * mul& * digitBitMask;

* * * * carry * * * * * * * * * =(mul> digitBitCount);

* *}

* * this-> digits [aPos + bPos] = carry;

*}


有很多正确的方法可以写这个,还有一些

可能比这个例子更好。 *但是我

希望你会发现它很有用。

无论如何,考虑到这个循环非常可接受的复杂性,

我会没有打扰它或添加

封装层。



好​​吧,然后。


Hi.
(Xposted to both comp.lang.c++ and comp.programming since I''ve got
questions related to both C++ language and general programming)

I''ve got the following C++ code. The first routine runs in like 65% of
the time of the second routine. Yet both do the same thing. However,
the second one seems better in terms of the way the code is written
since it helps encapsulate the transformation in the inner loop better
making it easier to read, at least in my opinion. Maybe it''s not,
maybe the former is "better" that way and I should go with it, but if
the latter is "better" in that respect should I just ditch it anyway
and tradeoff for performance since I want this thing to be fast???

What each routine does is multiply two arbitrary-precision integers
together. The second one though uses an additional "slice" type that
provides a window enabling the "multiply and add" operation to be
performed on a limited range of digits, which can then be advanced
across the number, making clear that part of the algorithn.

I''m using the simple "grade school" multiply algorithm. Note how the
second routine more easily outlines this algorithm, while in the first
it is a little more difficult to see. Which would you prefer, exactly?

For brevity, class definitions and other members have been omitted.
However it should not be too difficult to figure out the necessary
information.

Also, could I lose the typecasts in this code or do I need them?

The reason why I''m asking is because I remembered getting told earlier
by someone here (Alf P. Steinbach, group: comp.lang.c++) about how my
last implementation of my program (this is for a fractal generator)
had some sort of bad, bug-inducing "abstraction gap" yet I was unable
to get enough elaboration responses about it so I''ve been more or less
shooting around trying to figure out how to make it better (although
he did give me some *examples* of where there were problems, which I
have since fixed.). But it seems I''m losing performance and that''s not
a good thing for my application. Not to mention also that my original
bignum implementation was called "silly" as well("...although I didn''t
look at the silly bignum class, whatever it''s fault is..." ref:
http://groups.google.com/group/comp....9?dmode=source)
without any explanation as to what exactly was so silly about it. So I
had to start dark-shooting there too trying to figure out what I had
done wrong.

First version (fastest):
---
void RawDigit::Mul(const RawDigit &a, const RawDigit &b)
{
/* Check to see if we''re doing an in-place multiply */
if(&a == this)
{
MulInPlace(b);
return;
} else if(&b == this) {
MulInPlace(a);
return;
}

/* Get lengths. */
std::size_t rLength(GetLength());
std::size_t aLength(a.GetLength());
std::size_t bLength(b.GetLength());

/* Zeroize this. */
Zeroize();

/* Do the multiplication. */
TwoDigit64 carry(0);
for(std::size_t i(0);i<aLength && i<rLength;i++)
{
carry = 0;

for(std::size_t j(0);j<bLength && (i+j)<rLength;j++)
{
TwoDigit64
tmp((static_cast<TwoDigit64>(a.digits[i])*b.digits[j]) +
digits[i+j] + carry);
carry = tmp >DIGIT_BITS;
tmp -= carry << DIGIT_BITS;
digits[i+j] = tmp;
}

if(i+bLength < rLength)
digits[i+bLength] = carry;
}
}
---

Second version (slower):
---
*** Slice manipulation.
inline Digit32 MulAddOp(Digit32 a, Digit32 b, Digit32 c, Digit32
&carry)
{
TwoDigit64 sum(a + (static_cast<TwoDigit64>(b)*c) + carry);
carry = sum >DIGIT_BITS;
return(sum & MAX_DIGIT);
}

inline Digit32 MulAddCarryPropOpL(Digit32 a, Digit32 &carry)
{
Digit32 sum(a + carry);
carry = sum < carry;
return(sum);
}

inline Digit32 MulAddCarryPropOpR(Digit32 b, Digit32 c, Digit32
&carry)
{
TwoDigit64 sum((static_cast<TwoDigit64>(b)*c) + carry);
carry = sum >DIGIT_BITS;
return(sum & MAX_DIGIT);
}

Digit32 RawDigitSlice::MulAdd(const ConstRawDigitSlice &a,
const Digit32 b,
const ConstRawDigitSlice &c)
{
/* Set up iterators */
DigitIterator32 ri(GetStartIterator());
ConstDigitIterator32 ai(a.GetConstStartIterator());
ConstDigitIterator32 ci(c.GetConstStartIterator());

/* Get minimum length of a and c. */
std::size_t minLength(std::min(std::min(a.length, c.length),
length));

/* Do the combined multiply + add */
Digit32 carry(0);
std::size_t i(0);
for(i;i<minLength;++i,++ri,++ai,++ci)
*ri = MulAddOp(*ai, b, *ci, carry);

/* Handle the remaining part(s) of a or b. */
int largerLength(std::min(std::max(a.length, c.length), length));
if(a.length >= c.length)
{
for(i;i<largerLength;++i,++ri,++ai)
*ri = MulAddCarryPropOpL(*ai, carry);
} else {
for(i;i<largerLength;++i,++ri,++ci)
*ri = MulAddCarryPropOpR(b, *ci, carry);
}

/* Finish carry propagation */
if(largerLength < length)
{
for(i;i<length;++i,++ri)
*ri = MulAddCarryPropOpL(0, carry);
}

/* Done! */
return(carry);
}

*** This next routine is in a different source file, the one
implementing the full "RawDigit" class.
*** The former would be in another file that implements the
"RawDigitSlice" class.
void RawDigit::Mul(const RawDigit &a, const RawDigit &b)
{
/* Check to see if we''re doing an in-place multiply */
if(&a == this)
{
MulInPlace(b);
return;
} else if(&b == this) {
MulInPlace(a);
return;
}

/* Get lengths. */
std::size_t rLength(GetLength());
std::size_t aLength(a.GetLength());
std::size_t bLength(b.GetLength());

/* Zeroize this. */
Zeroize();

/* Set up slices. */
RawDigitSlice runningSum(*this, 0, aLength + 1); /* explanation:
(<RawDigit object>, <origin digit idx>, <nbr of digits in slice>) */
ConstRawDigitSlice as(a);

/* Do the multiplication. */
for(std::size_t i(0);i<bLength && i<rLength;++i,++runningSum)
acc.MulAdd(runningSum, b[i], as);
}
---

解决方案

"mike3" <mi******@yahoo.comwrote in message
news:0a**********************************@q24g2000 prf.googlegroups.com...
: Hi.
: (Xposted to both comp.lang.c++ and comp.programming since I''ve got
: questions related to both C++ language and general programming)
:
: I''ve got the following C++ code. The first routine runs in like 65% of
: the time of the second routine. Yet both do the same thing. However,
: the second one seems better in terms of the way the code is written
: since it helps encapsulate the transformation in the inner loop better
: making it easier to read, at least in my opinion. Maybe it''s not,
: maybe the former is "better" that way and I should go with it, but if
: the latter is "better" in that respect should I just ditch it anyway
: and tradeoff for performance since I want this thing to be fast???
As a first-time reader of both implementations, I find the first one
much easier to understand and maintain than the second one. Adding
levels of abstractions without a clear benefit only obfuscates code.

: I''m using the simple "grade school" multiply algorithm. Note how the
: second routine more easily outlines this algorithm, while in the first
: it is a little more difficult to see. Which would you prefer, exactly?
The first one.

: Also, could I lose the typecasts in this code or do I need them?
I''ll comment & review the first function below.
At some point, you do need to (at least implicitly) cast
an operand to the larger type, as this might not happen
automatically for some combinations of types, in C++.

: First version (fastest):
: ---
: void RawDigit::Mul(const RawDigit &a, const RawDigit &b)
: {
: /* Check to see if we''re doing an in-place multiply */
: if(&a == this)
: {
: MulInPlace(b);
: return;
: } else if(&b == this) {
: MulInPlace(a);
: return;
: }
General class design: unless there is an established reason
not to do so, I would use operator overloading and implement
(as members of the Num class):
static Num operator *( const Num& a, const Num& b );
Num& operator *( const Num& a );
Maybe RawDigit::Mul is an internal private member, and the
above operations are provided? But then the special cases
(e.g. multiply in place) would best be handled in the
layers above...

: /* Get lengths. */
: std::size_t rLength(GetLength());
: std::size_t aLength(a.GetLength());
: std::size_t bLength(b.GetLength());
:
: /* Zeroize this. */
: Zeroize();

Is the intent truly to implement modulo math, and to allow
the result to be truncated? If not the result should be
resized automatically, and the code is simplified.

: /* Do the multiplication. */
: TwoDigit64 carry(0);
Technically, the carry should only require 1 digit,
and can be defined within the outer loop below.

: for(std::size_t i(0);i<aLength && i<rLength;i++)
: {
: carry = 0;
:
: for(std::size_t j(0);j<bLength && (i+j)<rLength;j++)
: {
: TwoDigit64
: tmp((static_cast<TwoDigit64>(a.digits[i])*b.digits[j]) +
: digits[i+j] + carry);
: carry = tmp >DIGIT_BITS;
: tmp -= carry << DIGIT_BITS;
: digits[i+j] = tmp;
: }
:
: if(i+bLength < rLength)
: digits[i+bLength] = carry;
: }
: }

Let''s say that you have the following digit-related
definitions within your class:
typedef ... Digit;
typedef ... TwoDigit;
const unsigned digitBitCount = ...;
const Digit digitBitMask = 0xF...UL;

Assuming no truncation (because of adequate pre-allocation
of result digits), the same algorithm can be written as:

for( unsigned aPos = 0 ; aPos<aLength ; ++aPos )
{
Digit carry = 0;
TwoDigit const aDig = a.digits[aPos]; //NB: cast "hidden" here
for( unsigned bPos = 0 ; bPos<bLength ; ++bPos )
{
TwoDigit mul = aDig * b.digits[bPos]
+ this->digits[aPos+bPos]
+ carry;

this->digits[aPos+bPos] = mul & digitBitMask;
carry = ( mul >digitBitCount );
}
this->digits[aPos+bPos] = carry;
}

There are many correct ways to write this, and some are
probably better is some ways than this example. But I
hope that you will find it useful.
In any case, given the very acceptable complexity of this loop,
I would not bother breaking it up or adding layers of
encapsulation.

Regards -Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Brainbench MVP for C++ <http://www.brainbench.com


On 20 Apr., 08:26, mike3 <mike4...@yahoo.comwrote:

Hi.
(Xposted to both comp.lang.c++ and comp.programming since I''ve got
questions related to both C++ language and general programming)

I''ve got the following C++ code. The first routine runs in like 65% of
the time of the second routine. Yet both do the same thing. However,
the second one seems better in terms of the way the code is written
since it helps encapsulate the transformation in the inner loop better
making it easier to read, at least in my opinion. Maybe it''s not,
maybe the former is "better" that way and I should go with it, but if
the latter is "better" in that respect should I just ditch it anyway
and tradeoff for performance since I want this thing to be fast???

What each routine does is multiply two arbitrary-precision integers
together. The second one though uses an additional "slice" type that
provides a window enabling the "multiply and add" operation to be
performed on a limited range of digits, which can then be advanced
across the number, making clear that part of the algorithn.

I''m using the simple "grade school" multiply algorithm. Note how the
second routine more easily outlines this algorithm, while in the first
it is a little more difficult to see. Which would you prefer, exactly?

For brevity, class definitions and other members have been omitted.
However it should not be too difficult to figure out the necessary
information.

Also, could I lose the typecasts in this code or do I need them?

The reason why I''m asking is because I remembered getting told earlier
by someone here (Alf P. Steinbach, group: comp.lang.c++) about how my
last implementation of my program (this is for a fractal generator)
had some sort of bad, bug-inducing "abstraction gap" yet I was unable
to get enough elaboration responses about it so I''ve been more or less
shooting around trying to figure out how to make it better (although
he did give me some *examples* of where there were problems, which I
have since fixed.). But it seems I''m losing performance and that''s not
a good thing for my application. Not to mention also that my original
bignum implementation was called "silly" as well("...although I didn''t
look at the silly bignum class, whatever it''s fault is..." ref:http://groups.google.com/group/comp....38d0b8dab9?dmo...)
without any explanation as to what exactly was so silly about it. So I
had to start dark-shooting there too trying to figure out what I had
done wrong.

First version (fastest):
---
void RawDigit::Mul(const RawDigit &a, const RawDigit &b)
{
/* Check to see if we''re doing an in-place multiply */
if(&a == this)
{
MulInPlace(b);
return;
} else if(&b == this) {
MulInPlace(a);
return;
}

/* Get lengths. */
std::size_t rLength(GetLength());
std::size_t aLength(a.GetLength());
std::size_t bLength(b.GetLength());

/* Zeroize this. */
Zeroize();

/* Do the multiplication. */
TwoDigit64 carry(0);
for(std::size_t i(0);i<aLength && i<rLength;i++)
{
carry = 0;

for(std::size_t j(0);j<bLength && (i+j)<rLength;j++)
{
TwoDigit64
tmp((static_cast<TwoDigit64>(a.digits[i])*b.digits[j]) +
digits[i+j] + carry);
carry = tmp >DIGIT_BITS;
tmp -= carry << DIGIT_BITS;
digits[i+j] = tmp;
}

if(i+bLength < rLength)
digits[i+bLength] = carry;
}}

[snip Second version (slower)]

I consider the first version easier to read. I prefer to see the
main algorithm at one page instead of "millions of small
methods". What I am missing is:
- The signs. All your values seem to be unsigned.
Do you want to use two''s complement representation or
sign + magnitude.
- The management of the size. The user of the functions
should not be bothered with resizing the big integers.

Did you know that big integer libraries already exist. Some
people have taken the burden to write a library for big integer
functions. For example: Me.
If you download Seed7
(http://sourceforge.net/project/showf...roup_id=151126)
you will see that the file seed7/src/big_rtl.c contains a bigInteger
library written in C. This library manages the memory for the
digits automatically and contains the usual arithmetic operations
(inclusive two forms of bigInteger division and remainder which
trunc towards zero or minus infinite). The multiplication uses
the Karatsuba algorithm when possible.
BTW.: I plan to do an improved release today (By coincidence I
did several improvements for bigInteger''s). If you wait for approx.
12 hours you can get the new version.

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.


On Apr 20, 1:43*am, "Ivan Vecerina"
<_INVALID_use_webfo...@ivan.vecerina.comwrote:

"mike3" <mike4...@yahoo.comwrote in message

news:0a**********************************@q24g2000 prf.googlegroups.com...
: Hi.
: (Xposted to both comp.lang.c++ and comp.programming since I''ve got
: questions related to both C++ language and general programming)
:
: I''ve got the following C++ code. The first routine runs in like 65% of
: the time of the second routine. Yet both do the same thing. However,
: the second one seems better in terms of the way the code is written
: since it helps encapsulate the transformation in the inner loop better
: making it easier to read, at least in my opinion. Maybe it''s not,
: maybe the former is "better" that way and I should go with it, but if
: the latter is "better" in that respect should I just ditch it anyway
: and tradeoff for performance since I want this thing to be fast???
As a first-time reader of both implementations, I find the first one
much easier to understand and maintain than the second one. *Adding
levels of abstractions without a clear benefit only obfuscates code.

OK, then I''ll go for the first. Especially considering it''s faster...

: I''m using the simple "grade school" multiply algorithm. Note how the
: second routine more easily outlines this algorithm, while in the first
: it is a little more difficult to see. Which would you prefer, exactly?
The first one.

Hmm.

: Also, could I lose the typecasts in this code or do I need them?
I''ll comment & review the first function below.
At some point, you do need to (at least implicitly) cast
an operand to the larger type, as this might not happen
automatically for some combinations of types, in C++.

So then in this case a cast is OK, right?

: First version (fastest):
: ---
: void RawDigit::Mul(const RawDigit &a, const RawDigit &b)
: {
: * */* Check to see if we''re doing an in-place multiply */
: * *if(&a == this)
: * *{
: * * *MulInPlace(b);
: * * *return;
: * *} else if(&b == this) {
: * * *MulInPlace(a);
: * * *return;
: * *}
General class design: unless there is an established reason
not to do so, I would use operator overloading and implement
(as members of the Num class):
* static *Num * operator *( const Num& a, const Num& b );
* * * * * Num& *operator *( const Num& a );
Maybe RawDigit::Mul is an internal private member, and the
above operations are provided? *But then the special cases
(e.g. multiply in place) would best be handled in the
layers above...

I''ve tried overlaoded operators but one seems to have to construct
a temporary copy to hold results to return in the case of the binary
operators. This is a big problem for my app. as I have performance
concerns. Deallocating and reallocating memory billions of times
is a serious waste of performance. (These routines will be called
billions of times.)

Therefore I do the opposite and build these low-level routines first,
then build the overloaded operators using them, which can then be
used in all non-performance-critical areas of the program. If it turns
out that I don''t need the MulInPlace() functionality to be integrated
with Mul(), I''ll just remove it and document that Mul() does not
work in-place. Is that acceptable practice?

: * */* Get lengths. */
: * *std::size_t rLength(GetLength());
: * *std::size_t aLength(a.GetLength());
: * *std::size_t bLength(b.GetLength());
:
: * */* Zeroize this. */
: * *Zeroize();

Is the intent truly to implement modulo math, and to allow
the result to be truncated? If not the result should be
resized automatically, and the code is simplified.

So I can omit this functionality and just document that the
routine doesn''t behave as most would expect? (If I have operands
of 3 differing lengths I''d usually expect a Mul() to respect
those lengths but then again your opinion may differ.)

Also the above just does nothing but clear the result buffer,
which you need as we''re going to add (as in arithmetical addition)
numbers to it. You need that for the multiply algorithm to work.

And you need to get the lengths of the input operands because
you''re going to loop over the digits, no? And you don''t want to
read outside the buffer or read too few digits? I think you would,
but then again maybe I''m wrong...

: * */* Do the multiplication. */
: * *TwoDigit64 carry(0);
Technically, the carry should only require 1 digit,
and can be defined within the outer loop below.

But I thought then that would create a performance-wasting conversion
to/from 32/64 again and again.

: * *for(std::size_t i(0);i<aLength && i<rLength;i++)
: * *{
: * * * carry = 0;
:
: * * * for(std::size_t j(0);j<bLength && (i+j)<rLength;j++)
: * * * {
: * * * * *TwoDigit64
: tmp((static_cast<TwoDigit64>(a.digits[i])*b.digits[j]) +
: * * * * * * * * * * * * digits[i+j] + carry);
: * * * * *carry = tmp >DIGIT_BITS;
: * * * * *tmp -= carry << DIGIT_BITS;
: * * * * *digits[i+j] = tmp;
: * * * }
:
: * * * if(i+bLength < rLength)
: * * * * digits[i+bLength] = carry;
: * *}
: }

Let''s say that you have the following digit-related
definitions within your class:
* typedef ... *Digit;
* typedef ... *TwoDigit;
* const unsigned digitBitCount = ...;
* const Digit digitBitMask = 0xF...UL;

Assuming no truncation (because of adequate pre-allocation
of result digits), the same algorithm can be written as:

*for( unsigned aPos = 0 ; aPos<aLength ; ++aPos )
*{
* * Digit carry = 0;
* * TwoDigit const aDig = a.digits[aPos]; //NB: cast "hidden" here
* * for( unsigned bPos = 0 ; bPos<bLength ; ++bPos )
* * {
* * * * TwoDigit mul = aDig * b.digits[bPos]
* * * * * * * * * * *+ this->digits[aPos+bPos]
* * * * * * * * * * *+ carry;

* * * * this->digits[aPos+bPos] = * mul & *digitBitMask;
* * * * carry * * * * * * * * * = ( mul >digitBitCount );
* * }
* * this->digits[aPos+bPos] = carry;
*}

There are many correct ways to write this, and some are
probably better is some ways than this example. *But I
hope that you will find it useful.
In any case, given the very acceptable complexity of this loop,
I would not bother breaking it up or adding layers of
encapsulation.

OK, then.


这篇关于哪个是实现此算法的更好方法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆