Replacing extraordinarily slow pow() function

Question

We have a CFD solver, and while running a simulation we found that it ran extraordinarily slowly on some machines but not on others. Using Intel VTune, we traced the problem to the following line (in Fortran):

RHOV= RHO_INF*((1.0_wp - COEFF*EXP(F0)))**(1.0_wp/(GAMM - 1.0_wp))

Drilling in with VTune, the problem was traced to the call pow assembly line and when tracing the stack, it showed it was using __slowpow(). After some searching, this page showed up complaining about the same thing.

On the machine with libc version 2.12, the simulation took 18 seconds. On the machine with libc version 2.14, the simulation took 0 seconds.

Based on the information on the aforementioned page, the problem arises when the base to pow() is close to 1.0. So we did another simple test where we scaled the base by an arbitrary number before the pow() and then divided by the number raised to the exponent after the pow() call. This dropped the runtime from 18 seconds to 0 seconds with the libc 2.12 also.

However, it's impractical to put this all over the code where we do a**b. How would one go about replacing the pow() function in libc? For instance, I would like the assembly line call pow generated by the Fortran compiler to call a custom pow() function we write that does the scaling, calls the libc pow() and then divides by the scaling. How does one create an intermediate layer transparent to the compiler?

Edit

To clarify, we're looking for something like (pseudo-code):

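/* pow_from_libc is a placeholder for the original libc pow() */
double pow(double a, double b) {
   double tmp;
   a *= 5.0;                            /* scale the base away from 1.0 */
   tmp = pow_from_libc(a, b);
   return tmp / pow_from_libc(5.0, b);  /* undo the scaling */
}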

Is it possible to load the pow from libc and rename it in our custom function to avoid the naming conflicts? If the customPow.o file could rename pow from libc, what happens if libc is still needed for other things? Would that cause a naming conflict between pow in customPow.o and pow in libc?

Recommended answer

Just write your own pow function, put the .o file in a static library archive libmypow.a somewhere in the linker's library path, and pass -lmypow when linking.
