C++11 hypot slower than standard function and #define


Problem description


I ran tests on my E5-1620 v3 CPU using VS2013 and found that the standard C++11 function hypot takes roughly 1.4 to 1.5 times as long to finish as an equivalent hand-written function or #define macro. In a second round of tests I also gave the compiler an opportunity to parallelize the calculations.
The results I obtained:
C++11 hypot: 55 MOPs (mega operations per second)
_hypo function: 81-82 MOPs
#define func.: 81-83 MOPs

Parallelized, 10 operations each:
C++11 hypot: 152 MOPs
_hypo function: 268 MOPs
#define func.: 240 MOPs

At least the results of the three functions were exactly the same.

Is this a problem with VS, or could it also happen on a linux+gcc system?
Could it also happen on an AMD Ryzen?

UPDATED:
I also tested on linux with an old, slow AMD processor.
The parallelized functions gave the following results:
C++11 hypot: doubled the speed
_hypo function: 30% speed increase
#define function: 3x speed (double that of _hypo)

What I have tried:

This new version uses <chrono> and also works on linux and AMD systems:

#include <iostream>
#include <cstdio>   // printf, getchar
#include <math.h>
#include <chrono>
#ifdef __linux
#include <unistd.h>
#else
#pragma warning(disable:4996) //disable deprecateds
#endif
using namespace std;

// Time of the previous timer() call. steady_clock is monotonic, which suits interval timing.
auto start_time=chrono::steady_clock::now();

// Prints the time elapsed since the previous call and the throughput, then restarts the clock.
// Call timer() with no arguments to reset the start time without printing.
void timer(const char *title="",int data_size=1)
{
	auto stop_time=chrono::steady_clock::now();
	double us=(double) chrono::duration_cast<chrono::microseconds>(stop_time-start_time).count();
	// operations per microsecond is the same number as mega operations per second (MOPs)
	if (*title) printf("%s time = %7lgms = %7lg MOPs\n",title,us*1e-3,(double)data_size/us);
	start_time=chrono::steady_clock::now();
}

double _hypo(double x,double y)
{
	x*=x;y*=y;return sqrt(x+y);//quicker than other two 
	//return sqrt(x*x+y*y);
	//x=sqrt(x*x+y*y);return x;
}

#define _HYPO(x,y) (sqrt((x)*(x)+(y)*(y)))

int main()
{
	int N=10000000;double x,y=12.22;

	timer();
	for (int i=0;i<N;i++)
	{
		x=(double) i+1;
		y=hypot(x,y);y=hypot(x,y);
		y=hypot(x,y);y=hypot(x,y);
		y=hypot(x,y);y=hypot(x,y);
		y=hypot(x,y);y=hypot(x,y);
		y=hypot(x,y);y=hypot(x,y);
	}
	timer("hypot     ",10*N);
	for (int i=0;i<N;i++)
	{
		x=(double) i+1;
		y=_hypo(x,y);y=_hypo(x,y);
		y=_hypo(x,y);y=_hypo(x,y);
		y=_hypo(x,y);y=_hypo(x,y);
		y=_hypo(x,y);y=_hypo(x,y);
		y=_hypo(x,y);y=_hypo(x,y);
	}
	timer("_hypo     ",10*N);
	for (int i=0;i<N;i++)
	{
		x=(double) i+1;
		y=_HYPO(x,y);y=_HYPO(x,y);
		y=_HYPO(x,y);y=_HYPO(x,y);
		y=_HYPO(x,y);y=_HYPO(x,y);
		y=_HYPO(x,y);y=_HYPO(x,y);
		y=_HYPO(x,y);y=_HYPO(x,y);
	}
	timer("#define   ",10*N);

	//The following tests give the compiler and linker an opportunity to optimize:
	for (int i=0;i<N;i++)
	{
		x=(double) i+1;
		y=hypot(x,y)+hypot(0.5*x,y)+hypot(0.4*x,y)+hypot(0.3*x,y)+hypot(0.2*x,y)+
		  hypot(x,y)+hypot(0.5*x,y)+hypot(0.4*x,y)+hypot(0.3*x,y)+hypot(0.2*x,y);
	}
	timer("hypot par ",10*N);
	for (int i=0;i<N;i++)
	{
		x=(double) i+1;
		y=_hypo(x,y)     +_hypo(0.5 *x,y)+_hypo(0.4 *x,y)+_hypo(0.3 *x,y)+_hypo(0.2 *x,y)+
		  _hypo(x*0.11,y)+_hypo(0.13*x,y)+_hypo(0.41*x,y)+_hypo(0.31*x,y)+_hypo(0.23*x,y);
	}
	timer("_hypo par ",10*N);
	for (int i=0;i<N;i++)
	{
		x=(double) i+1;
		y=_HYPO(x,y)     +_HYPO(0.5 *x,y)+_HYPO(0.4 *x,y)+_HYPO(0.3 *x,y)+_HYPO(0.2 *x,y)+
		  _HYPO(x*0.11,y)+_HYPO(0.13*x,y)+_HYPO(0.41*x,y)+_HYPO(0.31*x,y)+_HYPO(0.23*x,y);
	}
	timer("_HYPO par ",10*N);


	//Check for errors:
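	//Note: y has already overflowed to +Inf here (each "par" iteration multiplies it by at least 10),
	//so this comparison only exercises the infinite case.
	x=1.12345;y+=.77777732;
	if ((hypot(x,y)!=_hypo(x,y))||(hypot(x,y)!=_HYPO(x,y)))
		cout<<"ERROR: "<<hypot(x,y)<<"!="<<_hypo(x,y) <<" or "<<hypot(x,y)<<"!="<<_HYPO(x,y)<<endl;
	else cout<<"hypot(x,y)==_hypo(x,y)==_HYPO(x,y)"<<endl;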

	cout<<"===END==="<<endl;getchar();
}

Recommended answer

The standard library math functions perform argument checking (Inf, NaN) and result checking (overflow / underflow for hypot) in order to comply with IEEE 754, POSIX (IEEE Std 1003.1), and ISO C. The POSIX specification of hypot() states:



If the correct value would cause overflow, a range error shall occur and hypot(), hypotf(), and hypotl() shall return the value of the macro HUGE_VAL, HUGE_VALF, and HUGE_VALL, respectively.

If x or y is ±Inf, +Inf shall be returned (even if one of x or y is NaN).

If x or y is NaN, and the other is not ±Inf, a NaN shall be returned.

If both arguments are subnormal and the correct result is subnormal, a range error may occur and the correct result is returned.


If you add these checks to your own function, it will take as much time as the library function, or even more.

