How to implement a variable-length ‘string’-y in C
Question
I’ve googled quite a bit, but I can’t find information on how variable-length strings are generally implemented in higher-level languages. I’m creating my own such language, and am not sure where to start with strings.
I have a struct describing the `string` type, and a `create` function that allocates such a ‘string’:
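#include <stdlib.h>
#include <string.h>

/* A safer `strcpy()`, using `strncpy()` and `sizeof()`.
   TO must be a real array (not a pointer), or sizeof(TO) is wrong. */
#define STRCPY(TO, FROM) \
  do { strncpy(TO, FROM, sizeof(TO)); TO[sizeof(TO) - 1] = '\0'; } while (0)

struct string {
  // …
  char native[1024];
};
/* Assumed: `string` is a pointer typedef, since it holds malloc()'s result. */
typedef struct string *string;

string String__create(const char native[]) {
  string this = malloc(sizeof(struct string));
  if (this == NULL) return NULL; /* allocation failed */
  // …
  STRCPY(this->native, native);
  return this;
}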
However, that would only allow strings of up to 1 KB. That’s sort of silly, and a huge waste of memory in most cases.
Given that I have to declare the memory to be used somehow… how do I go about implementing a string that can (efficiently) store an (effectively) unbounded number of characters?
Recommended answer
Many C++ std::string implementations now use a "Small String Optimization". In pseudo-code:
struct string {
  Int32 length
  union {
    char[12] shortString
    struct {
      char* longerString
      Int32 heapReservedSpace
    }
  }
}
The idea is that strings of up to 12 characters are stored in the shortString array. The entire string is then contiguous and uses only a single cache line. Longer strings are stored on the heap. This leaves you with 12 spare bytes in the string object. The pointer doesn't take all of that (it is 4 bytes on a 32-bit target and 8 on a 64-bit one), so you can also remember how much memory you've allocated on the heap (>= length). That helps to support scenarios in which you grow a string in small increments.
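As a rough illustration, the layout above could be sketched in C along the following lines. This is only one possible reading of the pseudo-code, not how any particular std::string is implemented. The names (`sso_string`, `sso_init`, `sso_append`, `sso_data`, `sso_free`, `SSO_CAP`) are all made up for this example, and the "double the needed size" growth policy is an assumption. The sketch uses `length <= SSO_CAP` to tell the two representations apart, so it only supports growing, and it stores no NUL terminator. Note also that on a 64-bit target, alignment padding will typically make the struct larger than 16 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SSO_CAP 12 /* inline capacity, taken from the pseudo-code */

typedef struct {
    int32_t length;                 /* characters in use; no NUL is stored */
    union {
        char short_string[SSO_CAP]; /* active while length <= SSO_CAP */
        struct {
            char   *longer_string;  /* heap buffer, active otherwise */
            int32_t heap_reserved;  /* bytes allocated, always >= length */
        } heap;
    } u;
} sso_string;

static int sso_is_short(const sso_string *s) { return s->length <= SSO_CAP; }

void sso_init(sso_string *s) { s->length = 0; }

/* Pointer to the characters. NOT NUL-terminated: pair it with s->length. */
const char *sso_data(const sso_string *s) {
    assert(s != NULL);
    return sso_is_short(s) ? s->u.short_string : s->u.heap.longer_string;
}

void sso_free(sso_string *s) {
    if (!sso_is_short(s)) free(s->u.heap.longer_string);
    s->length = 0; /* back to the empty inline state */
}

/* Append n bytes from src (which must not point into s itself).
   Returns 0 on success, -1 on overflow or allocation failure. */
int sso_append(sso_string *s, const char *src, size_t n) {
    /* Keep new_len * 2 representable as int32_t. */
    if (n > (size_t)(INT32_MAX / 2 - s->length)) return -1;
    int32_t new_len = s->length + (int32_t)n;

    if (new_len <= SSO_CAP) {
        /* Still fits inline: no allocation at all. */
        memcpy(s->u.short_string + s->length, src, n);
    } else if (sso_is_short(s)) {
        /* First spill to the heap: copy the inline bytes out before the
           union's heap fields overwrite them. */
        int32_t cap = new_len * 2;
        char *buf = malloc((size_t)cap);
        if (buf == NULL) return -1;
        memcpy(buf, s->u.short_string, (size_t)s->length);
        memcpy(buf + s->length, src, n);
        s->u.heap.longer_string = buf;
        s->u.heap.heap_reserved = cap;
    } else {
        /* Already on the heap: reallocate only when the reserve runs out. */
        if (new_len > s->u.heap.heap_reserved) {
            int32_t cap = new_len * 2;
            char *buf = realloc(s->u.heap.longer_string, (size_t)cap);
            if (buf == NULL) return -1;
            s->u.heap.longer_string = buf;
            s->u.heap.heap_reserved = cap;
        }
        memcpy(s->u.heap.longer_string + s->length, src, n);
    }
    s->length = new_len;
    return 0;
}
```

Because the reserved space is tracked separately from the length, a loop that appends one character at a time only reallocates occasionally rather than on every call, which is the "grow in small increments" case mentioned above. A caller would use `sso_init`, any number of `sso_append` calls, and finally `sso_free`.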