CPU dependent code: how to avoid function pointers?

Question

I have performance-critical code written for multiple CPUs. I detect the CPU at run time and, based on that, use the appropriate function for the detected CPU. So now I have to use function pointers and call the functions through them:

void do_something_neon(void);
void do_something_armv6(void);

/* Selected at run time, after CPU detection. */
void (*do_something)(void);

if (cpu == NEON) {
    do_something = do_something_neon;
} else {
    do_something = do_something_armv6;
}

/* Every call then goes through the function pointer: */
do_something();
...

Not that it matters, but I have optimized functions for different CPUs: ARMv6, and ARMv7 with NEON support. The problem is that using function pointers in many places makes the code slower, and I'd like to avoid that.

Basically, at load time the linker resolves relocations and patches the code with function addresses. Is there a way to get better control over that behavior?

Personally, I'd propose two different ways to avoid function pointers. The first is to create two separate .so (or .dll) files for the CPU-dependent functions, place them in different folders and, based on the detected CPU, add one of these folders to the search path (or LD_LIBRARY_PATH). Then load the main code, and the dynamic linker will pick up the required library from the search path. The other way is to compile two separate copies of the whole library :)

The drawback of the first method is that it forces me to have at least 3 shared objects (DLLs): two for the CPU-dependent functions and one for the main code that uses them. I need 3 because I have to be able to do CPU detection before loading the code that uses these CPU-dependent functions. The good part about the first method is that the app won't need to load multiple copies of the same code for multiple CPUs; it will load only the copy that will be used. The drawback of the second method is quite obvious, no need to talk about it.
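The first method could be sketched as a small launcher, roughly as follows. This is only an illustration under assumptions: the directory names lib/neon and lib/armv6, the helper names pick_lib_dir and launch, and the HWCAP_NEON_BIT value (bit 12, as used for NEON on 32-bit ARM Linux) are not from the question. The search path is read when a process starts, so the launcher sets the variable and then execs the main binary.

```c
#include <stdlib.h>
#include <unistd.h>
#include <sys/auxv.h>

/* Assumption: NEON is reported in bit 12 of AT_HWCAP on 32-bit ARM Linux. */
#define HWCAP_NEON_BIT (1UL << 12)

/* Hypothetical layout: one folder per CPU-specific build of the library. */
const char *pick_lib_dir(unsigned long hwcap)
{
    return (hwcap & HWCAP_NEON_BIT) ? "lib/neon" : "lib/armv6";
}

/* Hypothetical launcher: choose the folder, then start the main program,
   whose dynamic linker resolves the CPU-dependent library from that folder.
   Note that this overwrites any existing LD_LIBRARY_PATH value. */
int launch(const char *main_binary, char *const argv[])
{
    const char *dir = pick_lib_dir(getauxval(AT_HWCAP));

    if (setenv("LD_LIBRARY_PATH", dir, 1) != 0)
        return -1;

    execv(main_binary, argv);
    return -1; /* execv only returns on failure */
}
```

This makes the cost mentioned above visible: the launcher, the main code and the two CPU-specific libraries all have to exist as separate pieces.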

I'd like to know if there is a way to do that without using shared objects and manually loading them at run time. One way would be some hackery that involves patching code at run time, but that is probably too complicated to get right. Is there a better way to control relocations at load time? Maybe place the CPU-dependent functions in different sections and then somehow specify which section has priority? I think the Mac's Mach-O format has something like that.

An ELF-only solution (for ARM targets) is enough for me; I don't really care about PE (DLLs).

Thanks

Recommended answer

Here's the exact answer that I was looking for.

 GCC's __attribute__((ifunc("resolver")))

It requires fairly recent binutils.
There's a good article that describes this extension: Gnu support for CPU dispatching - sort of...
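A minimal sketch of how the attribute might be applied to the do_something example from the question, assuming GCC on an ELF target with a C library that supports ifunc (such as glibc). The resolver name, the int return values and the HWCAP_NEON_BIT constant are illustrative assumptions. The sketch also assumes that on 32-bit ARM the loader passes the hardware capability word to the resolver; on other targets it simply falls back to the generic version.

```c
/* Illustrative variants; they return an id so the choice is observable. */
enum { IMPL_ARMV6 = 1, IMPL_NEON = 2 };

int do_something_neon(void)  { return IMPL_NEON; }
int do_something_armv6(void) { return IMPL_ARMV6; }

typedef int (*do_something_fn)(void);

/* Assumption: NEON is reported in bit 12 of the hwcap word on ARM Linux. */
#define HWCAP_NEON_BIT (1UL << 12)

/* The resolver runs once, while the loader processes relocations.
   It should avoid calling into other libraries at that point. */
#if defined(__arm__)
static do_something_fn resolve_do_something(unsigned long hwcap)
{
    return (hwcap & HWCAP_NEON_BIT) ? do_something_neon : do_something_armv6;
}
#else
static do_something_fn resolve_do_something(void)
{
    return do_something_armv6; /* generic fallback on non-ARM targets */
}
#endif

/* Callers just call do_something(); the symbol is bound to whichever
   implementation the resolver returned. */
int do_something(void) __attribute__((ifunc("resolve_do_something")));
```

Call sites stay ordinary calls to do_something(), with no function pointer variable in the source. The selection happens when the symbol is resolved rather than at each call.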
