C++11 hypot slower than standard function and #define
Problem description
I ran tests on my E5-1620 v3 CPU using VS2013 and found that the standard C++11 function hypot takes roughly 1.4 to 1.5 times as long as an equivalent hand-written function or #define macro. In a second round of tests I also gave the compiler an opportunity to parallelize the calculation.
These are the results I obtained:
C++11 hypot: 55 MOPs (mega operations per second)
_hypo function: 81-82 MOPs
#define func. : 81-83 MOPs
Parallelized, 10 operations per loop iteration:
C++11 hypot: 152 MOPs
_hypo function: 268 MOPs
#define func. : 240 MOPs
At least the results of the three functions were exactly the same.
Is this a problem with VS, or could it also happen on a Linux + gcc system?
Could it also happen on an AMD Ryzen?
UPDATE:
I also tested on Linux with an old, slow AMD processor.
The parallelized versions gave the following results:
C++11 hypot: doubled the speed
_hypo function: 30% speed increase
#define function: 3x speed (double that of _hypo)
What I have tried:
This new version uses <chrono> and also works on Linux and on AMD systems:
#include <iostream>
#include <cstdio> //printf, getchar
#include <math.h>
#include <chrono>
#ifdef __linux
#include <unistd.h>
#else
#pragma warning(disable:4996) //disable deprecateds
#endif
using namespace std;
typedef unsigned char uchar;
/*
time_t start,stop;char null_char='\0';
//Use empty timer() to reset start time:
void timer(char *title=&null_char,int data_size=1){ stop=clock(); if (*title) printf("%s time = %7lg = %7lg MOPs\n",title,(double) (stop-start)/(double) CLOCKS_PER_SEC, 1e-6*data_size/( (double)(stop-start)/(double)CLOCKS_PER_SEC ) ); start=clock(); }
*/
//steady_clock is monotonic, so it suits interval timing better than system_clock
auto start_time=chrono::steady_clock::now(),stop_time=start_time;
//Call timer() with no arguments to reset the start time.
//title is const char* so that string literals can be passed in standard C++11
void timer(const char *title="",int data_size=1)
{
stop_time=chrono::steady_clock::now();
double us=(double) chrono::duration_cast<chrono::microseconds>(stop_time-start_time).count();
if (*title) printf("%s time = %7lgms = %7lg MOPs\n",title,us*1e-3,(double)data_size/us);
start_time=chrono::steady_clock::now();
}
double _hypo(double x,double y)
{
x*=x;y*=y;return sqrt(x+y);//quicker than other two
//return sqrt(x*x+y*y);
//x=sqrt(x*x+y*y);return x;
}
#define _HYPO(x,y) (sqrt((x)*(x)+(y)*(y)))
int main()
{
int N=10000000;double x,y=12.22;
timer();
for (int i=0;i<N;i++)
{
x=(double) i+1;
y=hypot(x,y);y=hypot(x,y);
y=hypot(x,y);y=hypot(x,y);
y=hypot(x,y);y=hypot(x,y);
y=hypot(x,y);y=hypot(x,y);
y=hypot(x,y);y=hypot(x,y);
}
timer("hypot ",10*N);
for (int i=0;i<N;i++)
{
x=(double) i+1;
y=_hypo(x,y);y=_hypo(x,y);
y=_hypo(x,y);y=_hypo(x,y);
y=_hypo(x,y);y=_hypo(x,y);
y=_hypo(x,y);y=_hypo(x,y);
y=_hypo(x,y);y=_hypo(x,y);
}
timer("_hypo ",10*N);
for (int i=0;i<N;i++)
{
x=(double) i+1;
y=_HYPO(x,y);y=_HYPO(x,y);
y=_HYPO(x,y);y=_HYPO(x,y);
y=_HYPO(x,y);y=_HYPO(x,y);
y=_HYPO(x,y);y=_HYPO(x,y);
y=_HYPO(x,y);y=_HYPO(x,y);
}
timer("#define ",10*N);
//Following tests gives an opportunity to be optimized by the compiler&linker:
for (int i=0;i<N;i++)
{
x=(double) i+1;
y=hypot(x,y)+hypot(0.5*x,y)+hypot(0.4*x,y)+hypot(0.3*x,y)+hypot(0.2*x,y)+
hypot(x,y)+hypot(0.5*x,y)+hypot(0.4*x,y)+hypot(0.3*x,y)+hypot(0.2*x,y);
}
timer("hypot par ",10*N);
for (int i=0;i<N;i++)
{
x=(double) i+1;
y=_hypo(x,y) +_hypo(0.5 *x,y)+_hypo(0.4 *x,y)+_hypo(0.3 *x,y)+_hypo(0.2 *x,y)+
_hypo(x*0.11,y)+_hypo(0.13*x,y)+_hypo(0.41*x,y)+_hypo(0.31*x,y)+_hypo(0.23*x,y);
}
timer("_hypo par ",10*N);
for (int i=0;i<N;i++)
{
x=(double) i+1;
y=_HYPO(x,y) +_HYPO(0.5 *x,y)+_HYPO(0.4 *x,y)+_HYPO(0.3 *x,y)+_HYPO(0.2 *x,y)+
_HYPO(x*0.11,y)+_HYPO(0.13*x,y)+_HYPO(0.41*x,y)+_HYPO(0.31*x,y)+_HYPO(0.23*x,y);
}
timer("_HYPO par ",10*N);
//Check for errors:
x=1.12345;y+=.77777732;
if ((hypot(x,y)!=_hypo(x,y))||(hypot(x,y)!=_HYPO(x,y)))
cout<<"ERROR: "<<hypot(x,y)<<"!="<<_hypo(x,y) <<" or "<<hypot(x,y)<<"!="<<_HYPO(x,y)<<endl;
else cout<<"hypot(x,y)==_hypo(x,y)==_HYPO(x,y)"<<endl;
cout<<"===END==="<<endl;getchar();
}
Recommended answer
The standard library math functions perform parameter checking (INF, NaN) and result checking (overflow / underflow for hypot) to be IEEE 754 / POSIX (IEEE 1003) / ISO C compliant.
If the correct value would cause overflow, a range error shall occur and hypot(), hypotf(), and hypotl() shall return the value of the macro HUGE_VAL, HUGE_VALF, and HUGE_VALL, respectively.
If x or y is ±Inf, +Inf shall be returned (even if one of x or y is NaN).
If x or y is NaN, and the other is not ±Inf, a NaN shall be returned.
If both arguments are subnormal and the correct result is subnormal, a range error may occur and the correct result is returned.
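The rules quoted above can be observed by comparing std::hypot with the unchecked formula from the question. The following is a minimal sketch, assuming IEEE 754 doubles evaluated in double precision (typical for x86-64 builds without fast-math options); the helper names naive_hypot, naive_overflows_early, naive_underflows_early and inf_beats_nan_only_in_library are hypothetical and are not part of the original post.

```cpp
#include <cmath>
#include <limits>

// Unchecked version, equivalent to the question's _hypo: no special-case handling.
double naive_hypot(double x, double y) { return std::sqrt(x * x + y * y); }

// True if the naive formula overflows for (big, big) while std::hypot stays finite.
// With big = 1e200, big*big exceeds the double range although the true
// result (about 1.414e200) is representable.
bool naive_overflows_early(double big) {
    return std::isinf(naive_hypot(big, big)) && std::isfinite(std::hypot(big, big));
}

// True if the naive formula collapses to 0 for (tiny, tiny) while std::hypot
// still returns a nonzero result, e.g. tiny = 1e-200 where tiny*tiny underflows.
bool naive_underflows_early(double tiny) {
    return naive_hypot(tiny, tiny) == 0.0 && std::hypot(tiny, tiny) > 0.0;
}

// True if hypot(Inf, NaN) is +Inf for the library but NaN for the naive formula.
bool inf_beats_nan_only_in_library() {
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();
    return std::isinf(std::hypot(inf, nan)) && std::isnan(naive_hypot(inf, nan));
}
```

On such a platform all three checks are expected to return true for inputs like 1e200 and 1e-200, which is the extra correctness the library function provides in exchange for its additional work.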
If you add these checks to your function, it will consume the same - or even more - time than the library function.
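To illustrate what "adding these checks" might involve, here is a rough sketch of a hand-written version that handles the Inf/NaN cases and avoids intermediate overflow and underflow by scaling. The name checked_hypot is hypothetical, and this is not how any particular standard library implements hypot: it does not set errno or raise range errors, and its rounding may differ slightly from the library result.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical checked variant of the question's _hypo.
double checked_hypot(double x, double y) {
    x = std::fabs(x);
    y = std::fabs(y);
    // Inf takes priority over NaN, as in the quoted specification.
    if (std::isinf(x) || std::isinf(y))
        return std::numeric_limits<double>::infinity();
    if (std::isnan(x) || std::isnan(y))
        return std::numeric_limits<double>::quiet_NaN();
    const double big = std::max(x, y);
    const double small = std::min(x, y);
    if (big == 0.0)
        return 0.0;  // both arguments are zero; avoids 0/0 below
    // Scale by the larger argument so the squared term never exceeds 1.
    const double r = small / big;
    return big * std::sqrt(1.0 + r * r);
}
```

Compared with sqrt(x*x+y*y), this version adds several branches and a division per call, which gives a sense of where the extra time in a checked implementation can go.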