Is it possible to strip type names from an executable while keeping RTTI enabled?

Question

I recently disabled RTTI on my compiler (MSVC10) and the executable size decreased significantly. By comparing the produced executables in a text editor, I found that the RTTI-less version contains far fewer symbol names, which explains the saved space.

AFAIK, those symbol names are only used to fill the type_info structure associated with each polymorphic type, and one can access them programmatically by calling type_info::name().

According to the standard, the format of the string returned by type_info::name() is unspecified. That is, no one can rely on it to do serious things portably. So it should be possible for an implementation to always return an empty string without breaking anything, thus reducing the executable size without disabling RTTI support (so we can still use the typeid operator and compare type_info objects safely).
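To make the premise concrete, here is a minimal sketch of the kind of code that needs RTTI but never asks for a printable name. The types Shape, Circle and Square and both helper functions are hypothetical, invented for illustration. Whether a given implementation consults the name internally to answer these queries is a separate matter.

```cpp
#include <typeinfo>

// Hypothetical polymorphic types, used only for illustration.
struct Shape { virtual ~Shape() {} };
struct Circle : Shape {};
struct Square : Shape {};

// True when the dynamic type of `s` is exactly Circle.
// Only type_info::operator== is used; name() is never called.
bool is_circle(const Shape& s) {
    return typeid(s) == typeid(Circle);
}

// dynamic_cast also depends on RTTI, but the caller never sees a name.
const Square* as_square(const Shape& s) {
    return dynamic_cast<const Square*>(&s);
}
```

Code like this is what would have to keep working if the names were dropped.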

But is it possible? I'm using MSVC10 and have not found any option to do that. I can either disable RTTI completely (/GR-) or enable it with full type names (/GR). Does any compiler provide such an option?

Recommended answer

You're asking three different questions here.


  1. The initial question asks whether there's any way to get MSVC to not generate names, or whether it's possible with other compilers, or, failing that, whether there's any way to strip the names out of the generated type_info without breaking things.

  2. Then you want to know whether it would be possible to modify the MS ABI (presumably not too radically) so that it would be possible to strip the names.

  3. Finally, you want to know whether it would be possible to design an ABI that didn't have names.

Question #1 is itself a complex question. As far as I know, there's no way to get MSVC to not generate names. And most other compilers are aimed at ABIs that specifically define what typeid(foo).name() must return, so they also can't be made to not generate names.

The more interesting question is what happens if you strip out the names. For MSVC, I don't know the answer. The best thing to do here is probably to try it: go into your DLLs, change the first character of each name to \0, and see if it breaks dynamic_cast, etc. (I know that you can do this with Mac and Linux x86_64 executables generated by g++ 4.2 and it works, but let's put that aside for now.)
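The experiment could be sketched roughly as follows, working on an in-memory copy of the binary. The function blank_names is hypothetical, and the marker passed to it is an assumption: the answer does not say how to locate the names. MSVC class names usually appear in the image with a prefix like ".?AV", but verify that against your own binary and work on a backup copy.

```cpp
#include <cstddef>
#include <string>

// Overwrites the first byte of every occurrence of `marker` in `image`
// with '\0' and returns how many occurrences were patched.
// `image` is the raw file contents; `marker` is whatever prefix you have
// determined your type names start with (an assumption, not a given).
std::size_t blank_names(std::string& image, const std::string& marker) {
    if (marker.empty()) return 0;
    std::size_t patched = 0;
    std::size_t pos = 0;
    while ((pos = image.find(marker, pos)) != std::string::npos) {
        image[pos] = '\0';      // the name now reads as an empty C string
        ++patched;
        pos += marker.size();   // continue searching after this match
    }
    return patched;
}
```

After writing the patched bytes back to disk, run whatever exercises typeid and dynamic_cast in your program to see whether anything broke.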

On to question #2, assuming blanking the names doesn't work, it wouldn't be that hard to modify a name-based system to no longer require names. One trivial solution is to use hashes of the names, or even ROT13-encoded names (remember that the original goal here is "I don't want casual users to see the embarrassing names of my classes"). But I'm not sure that would count for what you're looking for. A slightly more complex solution is as follows:
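As an illustration of the "hashes of the names" idea, here is a minimal sketch using 64-bit FNV-1a. The choice of FNV-1a and the function name name_hash are assumptions made for the example; the answer does not name a particular hash.

```cpp
#include <cstdint>

// 64-bit FNV-1a over a NUL-terminated string. A compiler following the
// "hash instead of name" idea could store this value in place of the text.
std::uint64_t name_hash(const char* s) {
    std::uint64_t h = 14695981039346656037ULL;   // FNV-1a 64-bit offset basis
    for (; *s; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 1099511628211ULL;                    // FNV 64-bit prime
    }
    return h;
}
```

A fixed-size hash hides the names and is usually shorter than the names it replaces, but two different names can in principle collide, which a real scheme would have to account for.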


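  • For dllexported classes, generate a UUID, put it in the typeinfo, and also store it in the .LIB.

  • For dllimported classes, read the UUID from the .LIB and use that instead.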

So, if you manage to get the dllexport/dllimport right, it will work, because your exe will be using the same UUID as the dll. But what if you don't? What if you "accidentally" specify identical classes (e.g., an instantiation of the same template with the same parameters) in your DLL and your EXE, without marking one as dllexport and one as dllimport? RTTI won't see them as the same type.
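The UUID scheme and its failure mode can be modelled in a few lines. Everything here (TypeDesc, make_exported, make_imported, and the counter standing in for UUID generation) is hypothetical and only simulates what a compiler and linker would do.

```cpp
#include <array>
#include <cstdint>

typedef std::array<std::uint8_t, 16> Uuid;

// Hypothetical name-free type descriptor: identity is the UUID alone.
struct TypeDesc {
    Uuid id;
};

bool same_type(const TypeDesc& a, const TypeDesc& b) {
    return a.id == b.id;
}

// Stand-in for "the compiler generates a fresh UUID for a class defined
// here". A counter keeps the sketch deterministic.
TypeDesc make_exported() {
    static std::uint8_t next = 1;
    TypeDesc d = {};
    d.id[0] = next++;
    return d;
}

// Stand-in for "read the UUID out of the .LIB and use that".
TypeDesc make_imported(const TypeDesc& from_lib) {
    return from_lib;
}
```

A descriptor obtained through make_imported compares equal to the exported one. A second call to make_exported, which models the same class being defined independently in the EXE, produces a descriptor that does not.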

Is this a problem? Well, the C++ standard doesn't say it is. And neither does any MS documentation. In fact, the documentation explicitly says that you're not allowed to do this. You cannot use the same class or function in two different modules unless you explicitly export it from one module and import it into another. The fact that this is very hard to do with class templates is a problem, and it's a problem they don't try to solve.

Let's take a realistic example: Create a node-based linkedlist class template with a global static sentinel, where every list's last node points to that sentinel, and the end() function just returns a pointer to it. (Microsoft's own implementation of std::map used to do exactly this; I'm not sure if that's still true.) New up a linkedlist<int> in your exe, and pass it by reference to a function in your dll that tries to iterate from l.begin() to l.end(). It will never finish, because none of the nodes created by the exe will point to the copy of the sentinel in the dll. Of course if you pass l.begin() and l.end() into the DLL, instead of passing l itself, you won't have this problem. You can usually get away with passing a std::string or various other types by reference, just because they don't depend on anything that breaks. But you're not actually allowed to do so, you're just getting lucky. So, while replacing the names with UUIDs that have to be looked up at link time means types can't be matched up at link-loader time, the fact that types already can't be matched up at link-loader time means this is irrelevant.
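The sentinel problem can be simulated inside one program by giving each "module" its own sentinel object. This is a sketch under that assumption: exe_sentinel and dll_sentinel stand in for the two copies of the template's static data, and the step limit exists only so the demonstration terminates instead of hanging or crashing.

```cpp
#include <cstddef>

struct Node {
    int value;
    Node* next;
};

// Each "module" gets its own copy of the sentinel, as happens when a
// template's static data is instantiated separately in an EXE and a DLL.
Node exe_sentinel = {0, 0};
Node dll_sentinel = {0, 0};

// Walks from `begin` looking for `end`. The real loop would never finish;
// this one gives up at a null link or after `max_steps` nodes.
bool reaches_end(const Node* begin, const Node* end, std::size_t max_steps) {
    const Node* n = begin;
    for (std::size_t i = 0; i < max_steps && n != 0; ++i, n = n->next) {
        if (n == end) return true;
    }
    return false;
}
```

A list built in the "exe" ends at exe_sentinel, so a walk that looks for dll_sentinel never finds its end.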

It would be possible to build a name-based system that didn't have these problems. The ARM C++ ABI (and the iOS and Android ABIs based on it) places far fewer restrictions on programmers than MS does, and it has very specific requirements for how the link-loader has to make this work (section 3.2.5). This one couldn't be modified to not be name-based, because it was an explicit choice in the design that:


• type_info::operator== and type_info::operator!= compare the strings returned by type_info::name(), not just the pointers to the RTTI objects and their names.

• No reliance is placed on the address returned by type_info::name(). (That is, t1.name() != t2.name() does not imply that t1 != t2).

The first condition effectively requires that these operators (and type_info::before()) must be called out of line, and that the execution environment must provide appropriate implementations of them.
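A rough model of comparison by name, in the spirit of the rules quoted above, might look like this. NamedTypeInfo and same_type_by_name are hypothetical stand-ins, not the ABI's actual data layout.

```cpp
#include <cstring>

// Hypothetical stand-in for a type_info whose identity is its name string.
struct NamedTypeInfo {
    const char* name;
};

// Equality compares the characters of the names, not the addresses of the
// descriptors or of the strings, so separate copies of the same type's
// descriptor in different modules still compare equal.
bool same_type_by_name(const NamedTypeInfo& a, const NamedTypeInfo& b) {
    return std::strcmp(a.name, b.name) == 0;
}
```

This also shows why such a design cannot simply drop the names: if every name were blanked, every pair of types would compare equal.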

But it's also possible to build an ABI that doesn't have this problem and that doesn't use names. Which segues nicely to #3.

The Itanium ABI (used by, among other things, both OS X and recent Linux on x86_64 and i386) does guarantee that a linkedlist<int> generated in one object and a linkedlist<int> generated from the same header in another object can be linked together at runtime and will be the same type, which means they must have equal type_info objects. From 2.9.1:


It is intended that two type_info pointers point to equivalent type descriptions if and only if the pointers are equal. An implementation must satisfy this constraint, e.g. by using symbol preemption, COMDAT sections, or other mechanisms.

The compiler, linker, and link-loader must work together to make sure that a linkedlist<int> created in your executable points to the exact same type_info object that a linkedlist<int> created in your shared object would.

So, if you just took out all the names, it wouldn't make any difference at all. (And this is pretty easily tested and verified.)
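Identity by address, with no name involved at all, can be sketched with a static member of a class template. TypeTag and type_id are hypothetical names. This is an illustration of the principle, not how any compiler lays out type_info. Within a single program the language guarantees one object per instantiation; getting the same result across shared objects depends on the toolchain merging the copies.

```cpp
// One static object per instantiation; its address serves as the identity
// of the type. No string is stored anywhere.
template <class T>
struct TypeTag {
    static char id;
};
template <class T>
char TypeTag<T>::id = 0;

typedef const void* TypeId;

// Returns the address that identifies T.
template <class T>
TypeId type_id() {
    return &TypeTag<T>::id;
}
```

Two calls with the same type yield the same pointer and different types yield different pointers, which is all that an equality test needs.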

But how could you possibly implement this ABI spec? j_kubik effectively argues that it's impossible because you'd have to preserve some link-time information in the .so files. Which points to the obvious answer: preserve some link-time information in the .so files. In fact, you already have to do that to handle, e.g., load-time relocations; this just extends what you need to preserve. And in fact, both Apple and GNU/Linux/g++/ELF do exactly that. (This is part of the reason everyone building complex Linux systems had to learn about symbol visibility and vague linkage a few years ago.)

There's an even more obvious way to solve the problem: Write a C++-based link loader, instead of trying to make the C++ compiler and linker work together to trick a C-based link loader. But as far as I know, nobody's tried that since Be.
