Overhead of casting double to float?


Problem description

So I have megabytes of data stored as doubles that need to be sent over a network... now I don't need the precision that a double offers, so I want to convert these to a float before sending them over the network. What is the overhead of simply doing:

float myFloat = (float)myDouble;

I'll be doing this operation several million times every few seconds and don't want to slow anything down. Thanks

EDIT: My platform is x64 with Visual Studio 2008.

EDIT 2: I have no control over how they are stored.

Solution

As Michael Burr said, while the overhead strongly depends on your platform, it is definitely less than the time needed to send the values over the wire.


A rough estimate:

800 Mbit/s of payload on an excellent Gigabit wire is 100 MB/s, or 25M floats/second at 4 bytes each.

On a 2GHz single core, that gives you a whopping 80 clock cycles for each value converted to break even. Anything less, and you will save time. That should be more than enough on all architectures :)
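
The arithmetic behind that estimate can be written out as a small sketch. The helper names (`values_per_second`, `cycle_budget`) are made up for illustration, and the 800 Mbit/s and 2 GHz figures are the assumptions from the estimate, not measurements:

```cpp
#include <cstdint>

// Values per second a link can carry, given its payload rate in bits per
// second and the size of each value in bytes.
constexpr std::uint64_t values_per_second(std::uint64_t payload_bits_per_s,
                                          std::uint64_t bytes_per_value) {
    return payload_bits_per_s / (8 * bytes_per_value);
}

// Clock cycles one core can spend on each value before the conversion,
// rather than the network, becomes the bottleneck.
constexpr std::uint64_t cycle_budget(std::uint64_t clock_hz,
                                     std::uint64_t values_per_s) {
    return clock_hz / values_per_s;
}

// Assumed figures from the estimate above.
constexpr std::uint64_t kPayloadBitsPerSecond = 800000000ULL;  // 800 Mbit/s
constexpr std::uint64_t kClockHz = 2000000000ULL;              // 2 GHz
constexpr std::uint64_t kFloatBytes = 4;                       // 32-bit float

// 800 Mbit/s / (8 bits * 4 bytes) = 25M floats per second.
static_assert(values_per_second(kPayloadBitsPerSecond, kFloatBytes) == 25000000ULL,
              "expected 25M floats per second");
// 2 GHz / 25M floats per second = 80 cycles per float.
static_assert(cycle_budget(kClockHz, 25000000ULL) == 80,
              "expected an 80-cycle budget per value");
```

Plugging in your own link speed and clock rate gives the break-even budget for your setup. Sending the same values as 8-byte doubles halves the number of values the link can carry per second.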

A simple load-store cycle (barring all caching delays) should be below 5 cycles per value. With instruction interleaving, SIMD extensions and/or parallelizing on multiple cores, you are likely to do multiple conversions in a single cycle.
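
For reference, a minimal sketch of the conversion itself, assuming the doubles are available as a `std::vector<double>` (the question does not say how they are stored, and `to_floats` and `payload_bytes` are illustrative names). It is a plain scalar loop. Whether a compiler turns it into SIMD instructions depends on the compiler and its optimization settings, so measure before relying on that:

```cpp
#include <cstddef>
#include <vector>

// Narrow every double in `src` to float. Each value is rounded to a nearby
// representable float, so precision beyond roughly 7 significant decimal
// digits is lost. Values outside the range of float are not handled here.
std::vector<float> to_floats(const std::vector<double>& src) {
    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = static_cast<float>(src[i]);
    }
    return dst;
}

// Bytes that have to go over the wire for `count` values of type T.
template <typename T>
std::size_t payload_bytes(std::size_t count) {
    return count * sizeof(T);
}
```

On platforms where `float` is 4 bytes and `double` is 8, `payload_bytes<float>(n)` is half of `payload_bytes<double>(n)`, which is the saving the conversion buys on the wire.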

Also, the receiver will be happy having to handle only half the data. Remember that memory access time is nonlinear.


The only argument against the conversion would be a requirement that the transfer put minimal load on the CPU: a modern architecture could transfer the data from disk/memory to the bus without CPU intervention. However, with the numbers above I'd say that doesn't matter in practice.

[edit]
I checked some numbers: the 387 coprocessor would indeed have taken around 70 cycles for a load-store cycle. On the original Pentium, you are down to 3 cycles without any parallelization.

So, unless you run a gigabit network on a 386...
