Computation slower with float than with double


Problem description

Hello everybody.

I'm benchmarking a red-black Gauss-Seidel algorithm on 2-dimensional
grids of different sizes and types, and I get some strange results
when I change the computation from double to float.

Here are the test times for different grid SIZEs and types:

SIZE      128     256     512

float     2.20s   2.76s   7.86s

double    2.30s   2.47s   2.59s

As you can see, when the grid has a size of 256 nodes the code with the
float type takes drastically more time.

What could be the problem? Could it be the cache? Should float
computation always be faster than double?

Hope to receive an answer as soon as possible.
Thanks,

Michele Guidolin

P.S.

Here is some more information about the test:

The code that I'm testing is shown below; the double version is the
same, except that the constants are 0.25 rather than 0.25f.

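------------- CODE -------------

float u[SIZE][SIZE];    /* solution grid; SIZE is 128, 256 or 512 in the tests */
float rhs[SIZE][SIZE];  /* right-hand side */

/* Relax one interior point from its four neighbours. */
inline void gs_relax(int i, int j)
{
    u[i][j] = ( rhs[i][j] +
                0.0f  * u[i][j]   +
                0.25f * u[i+1][j] +
                0.25f * u[i-1][j] +
                0.25f * u[i][j+1] +
                0.25f * u[i][j-1] );
}

void gs_step_fusion()
{
    int i, j;

    /* update the red points: */

    /* first interior row */
    for (j = 1; j < SIZE-1; j = j+2)
    {
        gs_relax(1, j);
    }
    /* fused sweep: the same columns are relaxed in row i and in row i-1 */
    for (i = 2; i < SIZE-1; i++)
    {
        for (j = 1+(i+1)%2; j < SIZE-1; j = j+2)
        {
            gs_relax(i, j);
            gs_relax(i-1, j);
        }
    }
    /* last interior row */
    for (j = 1; j < SIZE-1; j = j+2)
    {
        gs_relax(SIZE-2, j);
    }
}
--------------- CODE --------------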

I'm testing this code on this machine:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Pentium(R) 4 CPU 3.20GHz
stepping        : 1
cpu MHz         : 3192.311
cache size      : 1024 KB
physical id     : 0
siblings        : 2
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 3
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
                  mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni
                  monitor ds_cpl cid
bogomips        : 6324.22

with Hyper-Threading enabled, on Linux 2.6.8.

The compiler is gcc 3.4.4 and the flags are:
CFLAGS = -g -O2 -funroll-loops -msse2 -march=pentium4 -Wall

Recommended answers

> I'm benchmarking a red-black Gauss-Seidel algorithm on 2-dimensional
> grids
>
> As you can see, when the grid has a size of 256 nodes the code with the
> float type takes drastically more time.
>
> What could be the problem? Could it be the cache? Should float
> computation always be faster than double?






Most likely your system does all floating-point computations using a
precision greater than float, then reduces the result when the value must
be stored, which happens more often as you increase the size of the table.

--
a signature






Michele Guidolin wrote:
> Hello everybody.
>
> I'm benchmarking a red-black Gauss-Seidel algorithm on 2-dimensional
> grids of different sizes and types, and I get some strange results
> when I change the computation from double to float.
>
> Here are the test times for different grid SIZEs and types:
>
> SIZE      128     256     512
>
> float     2.20s   2.76s   7.86s
>
> double    2.30s   2.47s   2.59s
>
> As you can see, when the grid has a size of 256 nodes the code with the
> float type takes drastically more time.

I see a modest increase at 256 and a huge increase at 512.
Have there been any transcription errors?

I also see that the code you didn't show probably accounts
for the lion's share of the running time, which casts suspicion
on drawing too many conclusions from a couple of experiments.
The running time of the posted code should increase (roughly)
as the square of SIZE, so changing SIZE from 128 to 512 should
inflate its running time by a factor of (about) sixteen. Yet
this supposed sixteen-fold increase added only 0.29 seconds to
the running time for "double"; a straightforward calculation
(based on data of unknown accuracy, to be sure) suggests that
the rest of the program accounts for 89% or more of the time
in that case, and even more in the other two.

... and if such a large portion of the total time resides
"elsewhere," it would be unwise to draw too many conclusions
until the contributions of "elsewhere" are better characterized,
or better controlled for (e.g., by repeated experiment and
statistical analysis).
> What could be the problem? Could it be the cache? Should float
> computation always be faster than double?







Cache might be a problem. So might alignment, or other
competing processes on the machine. If you're reading the
initial data from a file, perhaps one test paid the penalty of
actually reading from the disk while the others benefited from
the file system's cache. Or maybe the disk is just beginning
to go sour, and the O/S relocated an entire track of data in
the middle of one test. Or maybe the phase of the moon wasn't
propitious.

Should float always be faster than double? No, the C language
Standard is silent on matters of speed (which makes the entire
discussion off-topic here, or at least slightly so). You've shown
some puzzling data, but you need more data and more analysis to
draw good conclusions, and the results you eventually get will
most likely be relevant only to the system you got them on, and
not to the C language. I'd suggest further experimentation, and
a change to a newsgroup devoted to your system, where the experts
on your system's quirks hang out.

--
Er*********@sun.com


Michele Guidolin wrote:

> .... snip ...
> Here are the test times for different grid SIZEs and types:
>
> SIZE      128     256     512
> float     2.20s   2.76s   7.86s
> double    2.30s   2.47s   2.59s
>
> As you can see, when the grid has a size of 256 nodes the code with
> the float type takes drastically more time.
>
> What could be the problem? Could it be the cache? Should float
> computation always be faster than double?







C real computations are always done as doubles by default. When
you specify floats you are primarily constricting the storage, and
are causing float->double->float conversions to be done. These are
eating up the time.

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson

