Check whether equal string literals are stored at the same address


Problem description

I am developing a (C++) library that uses unordered containers. These require a hasher (usually a specialization of the template structure std::hash) for the types of the elements they store. In my case, those elements are classes that encapsulate string literals, similar to conststr in the example at the bottom of the cppreference page on constexpr. The standard library offers a specialization for constant char pointers, which, however, only hashes the pointer value, as explained in the 'Notes' section of the std::hash documentation:

There is no specialization for C strings. std::hash<const char*> produces a hash of the value of the pointer (the memory address), it does not examine the contents of any character array.

Although this is very fast (or so I think), the C++ standard does not guarantee that several equal string literals are stored at the same address, as explained in this question. If they aren't, the first requirement on hashers would not be met:

For two parameters k1 and k2 that are equal, std::hash<Key>()(k1) == std::hash<Key>()(k2)

I would like to compute the hash selectively: with the provided specialization if the aforementioned guarantee holds, or with some other algorithm otherwise. Although falling back to asking those who include my headers or build my library to define a particular macro is feasible, an implementation-defined one would be preferable.
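As an illustration of what "some other algorithm" could look like, here is a minimal sketch of a hasher that examines the characters instead of the pointer value. The names CStrContentHash, CStrContentEqual and LiteralSet are hypothetical and not part of the question, and FNV-1a is only one possible choice of hash function. Note that the container also needs a content-based equality predicate, because the default one compares pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_set>

// Hypothetical hasher: 64-bit FNV-1a over the characters, not the pointer value.
struct CStrContentHash {
    std::size_t operator()(const char* s) const noexcept {
        std::uint64_t h = 14695981039346656037ULL;  // FNV-1a offset basis
        for (; *s != '\0'; ++s) {
            h ^= static_cast<unsigned char>(*s);
            h *= 1099511628211ULL;                  // FNV-1a prime
        }
        return static_cast<std::size_t>(h);
    }
};

// Matching equality predicate: compares contents, not addresses.
struct CStrContentEqual {
    bool operator()(const char* a, const char* b) const noexcept {
        return std::strcmp(a, b) == 0;
    }
};

// Hypothetical alias for a set of C strings keyed by content.
using LiteralSet =
    std::unordered_set<const char*, CStrContentHash, CStrContentEqual>;
```

With these two functors, equal strings hash and compare equal wherever they are stored, at the cost of walking every character on each hash.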

Is there any macro, in any C++ implementation, but mainly g++ and clang, whose definition guarantees that several equal string literals are stored at the same address?

An example:

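#ifdef __GXX_SAME_STRING_LITERALS_SAME_ADDRESS__  // hypothetical macro illustrating the request
const char *str1 = "abc";  // pointers rather than arrays: two arrays are always distinct objects
const char *str2 = "abc";
assert( str1 == str2 );
#endif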

Solution

Is there any macro, in any C++ implementation, but mainly g++ and clang, whose definition guarantees that several equal string literals are stored at the same address?

  • gcc has the -fmerge-constants option:

Attempt to merge identical constants (string constants and floating-point constants) across compilation units.

This option is the default for optimized compilation if the assembler and linker support it. Use -fno-merge-constants to inhibit this behavior.

Enabled at levels -O, -O2, -O3, -Os.

  • Visual Studio has String Pooling (/GF option: "Eliminate Duplicate Strings"):

String pooling allows what were intended as multiple pointers to multiple buffers to be multiple pointers to a single buffer. In the following code, s and t are initialized with the same string. String pooling causes them to point to the same memory:

char *s = "This is a character buffer";
char *t = "This is a character buffer";

Note: although the MSDN example declares the string literals as char*, const char* should be used.

  • clang apparently also has the -fmerge-constants option, but I can't find much about it except in the --help output, so I'm not sure whether it really is equivalent to gcc's:

Disallow merging of constants


In any case, how string literals are stored is implementation-dependent (many implementations store them in the read-only portion of the program).
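A minimal sketch of how one might observe this on a given toolchain. The function names are hypothetical, and the result of the address comparison is unspecified, so it can change with the compiler, the optimization level, or flags such as the ones listed above:

```cpp
#include <cassert>
#include <cstring>

// True if this build happened to pool the two literals into one object.
// The standard leaves this unspecified, so either result is valid.
inline bool literals_share_address() {
    const char* a = "abc";
    const char* b = "abc";
    return a == b;
}

// Comparing contents, by contrast, is always well defined.
inline bool literals_have_equal_contents() {
    return std::strcmp("abc", "abc") == 0;
}
```

Only the second function can be relied upon. The first one merely reports what a particular build did.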

Rather than building your library on possibly implementation-dependent hacks, I can only suggest using std::string instead of C-style strings: they will behave exactly as you expect.

You can construct your std::string in place in your containers with the emplace() methods:

    std::unordered_set<std::string> my_set;
    my_set.emplace("Hello");
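To illustrate the difference, here is a small sketch (the function names are hypothetical) that looks up a string through a second buffer with equal contents. Two separate arrays are guaranteed to have distinct addresses, so the pointer-keyed set cannot find the element, whereas the std::string set can.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// std::string keys are hashed and compared by content.
inline bool found_by_content() {
    const char first[]  = "Hello";
    const char second[] = "Hello";  // distinct array, equal contents
    std::unordered_set<std::string> my_set;
    my_set.emplace(first);
    return my_set.count(second) == 1;
}

// const char* keys are hashed and compared by address.
inline bool found_by_address() {
    const char first[]  = "Hello";
    const char second[] = "Hello";  // distinct array, different address
    std::unordered_set<const char*> pointer_set;
    pointer_set.insert(first);
    return pointer_set.count(second) == 1;
}
```

Here found_by_content() returns true and found_by_address() returns false, regardless of whether the compiler merges string literals.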
