CPU dependent code: how to avoid function pointers?


Problem description

I have performance-critical code written for multiple CPUs. I detect the CPU at run-time and, based on that, use the appropriate function for the detected CPU. So, for now, I have to use function pointers and call the functions through them:

void do_something_neon(void);
void do_something_armv6(void);

void (*do_something)(void);

if(cpu == NEON) {
    do_something = do_something_neon;
}else{
    do_something = do_something_armv6;
}

//Use function pointer:
do_something(); 
...

Not that it matters, but I'll mention that I have optimized functions for different CPUs: armv6, and armv7 with NEON support. The problem is that using function pointers in many places makes the code slower, and I'd like to avoid that.

Basically, at load time the linker resolves relocations and patches the code with function addresses. Is there a way to get better control over that behavior?

Personally, I can think of two ways to avoid function pointers. The first is to create two separate .so files (or .dll's) for the CPU-dependent functions, place them in different folders and, based on the detected CPU, add one of these folders to the search path (or LD_LIBRARY_PATH). Then load the main code, and the dynamic linker will pick up the required library from the search path. The second is to compile two separate copies of the whole library :) The drawback of the first method is that it forces me to have at least 3 shared objects (dll's): two for the CPU-dependent functions and one for the main code that uses them. I need 3 because I have to be able to do CPU detection before loading the code that uses these CPU-dependent functions. The good part about the first method is that the app won't need to load multiple copies of the same code for multiple CPUs; it will load only the copy that will be used. The drawback of the second method is quite obvious, no need to talk about it.

I'd like to know if there is a way to do that without using shared objects and manually loading them at runtime. One way would be some hackery that involves patching code at run-time, but that's probably too complicated to get right. Is there a better way to control relocations at load time? Maybe place the CPU-dependent functions in different sections and then somehow specify which section has priority? I think the Mac's Mach-O format has something like that.

An ELF-only solution (for an ARM target) is enough for me; I don't really care about PE (dll's).

Thanks

Recommended answer

Here's the exact answer that I was looking for.

GCC's __attribute__((ifunc("resolver")))

It requires fairly recent binutils.
There's a good article that describes this extension: Gnu support for CPU dispatching - sort of...
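
To make the answer concrete, here is a minimal sketch of how the ifunc attribute could replace the function pointer from the question. It is an illustration under assumptions, not a tested ARM implementation: the function bodies are stand-ins that return a name so the selection can be observed (the originals return void), cpu_has_neon() is a hypothetical placeholder for real run-time detection, and resolve_do_something and do_something_fn are names made up for this example. It assumes GCC, binutils and a C library with ifunc support on an ELF target.

```c
/* Sketch of GNU ifunc dispatch. Assumes GCC (or Clang), a recent binutils,
   and glibc on an ELF target such as Linux. */

/* Stand-ins for the two optimized variants from the question. They return a
   name (the originals return void) only so the selection can be observed. */
static const char *do_something_neon(void)  { return "neon"; }
static const char *do_something_armv6(void) { return "armv6"; }

/* Hypothetical CPU check. This placeholder only looks at how the file was
   compiled; real code would query the CPU at run time. */
static int cpu_has_neon(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return 1;
#else
    return 0;
#endif
}

/* Function type shared by the implementations and the dispatched symbol. */
typedef const char *do_something_fn(void);

/* Resolver: the dynamic loader calls it when it binds the symbol and uses
   the returned address for do_something from then on. It runs very early,
   so it should stay simple. */
static do_something_fn *resolve_do_something(void)
{
    return cpu_has_neon() ? do_something_neon : do_something_armv6;
}

/* Callers use this as an ordinary function; no pointer variable is needed. */
const char *do_something(void) __attribute__((ifunc("resolve_do_something")));
```

Call sites stay as a plain do_something(); and the choice is made once by the loader rather than through a pointer variable kept in the program. Because the resolver runs before the program is fully set up, it is safest to keep it free of calls into other libraries.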
