Storing a string in an array of chars without the null character

Question

I'm reading the C++ Primer Plus by Stephen Prata. He gives this example:

char dog[8] = { 'b', 'e', 'a', 'u', 'x', ' ', 'I', 'I'}; // not a string!
char cat[8] = {'f', 'a', 't', 'e', 's', 's', 'a', '\0'}; // a string!

His comment is:

Both of these arrays are arrays of char, but only the second is a string. The null character plays a fundamental role in C-style strings. For example, C++ has many functions that handle strings, including those used by cout. They all work by processing a string character-by-character until they reach the null character. If you ask cout to display a nice string like cat in the preceding example, it displays the first seven characters, detects the null character, and stops. But if you are ungracious enough to tell cout to display the dog array from the preceding example, which is not a string, cout prints the eight letters in the array and then keeps marching through memory byte-by-byte, interpreting each byte as a character to print, until it reaches a null character. Because null characters, which really are bytes set to zero, tend to be common in memory, the damage is usually contained quickly; nonetheless, you should not treat nonstring character arrays as strings.

Now, if I declare my variables as globals, like this:

#include <iostream>
using namespace std;

char a[8] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'};
char b[8] = {'1', '2', '3', '4', '5', '6', '7', '8'};

int main(void)
{
    cout << a << endl;
    cout << b << endl;

    return 0;
}

the output will be:

abcdefgh12345678
12345678

So, indeed, cout "keeps marching through memory byte-by-byte", but only to the end of the second character array. The same thing happens with any combination of char arrays. I'm thinking that all the other addresses are initialized to 0 and that's why cout stops. Is this true? If I do something like:

for (int i = 0; i < 100; ++i)
{
    cout << *(&a + i) << endl;
}

I'm getting mostly empty space at output (like 95%, perhaps), but not everywhere.

If, however, I declare my char arrays a little bit shorter, like:

char a[3] = {'a', 'b', 'c'};
char b[3] = {'1', '2', '3'};

keeping all other things the same, I'm getting the following output:

abc
123

Now the cout doesn't even get past the first char array, not to mention the second. Why is this happening? I've checked the memory addresses and they are sequential, just like in the first scenario. For example,

cout << &a << endl;
cout << &b << endl;

003B903C
003B9040

Why is the behavior different in this case? Why doesn't it read beyond the first char array?

And lastly, if I declare my variables inside main, then I do get the behavior suggested by Prata: a lot of junk gets printed before a null character is reached somewhere.

I'm guessing that in the first case, the char array is declared on the heap and that this is initialized to 0 (but not everywhere, why?) and cout behaves differently based on the length of the char array (why?)

I'm using Visual Studio 2010 for these examples.

Recommended answer

It looks like your C++ compiler is allocating space in 4-byte chunks, so that every object has an address that is a multiple of 4 (the hex addresses in your dump are divisible by 4). Compilers like to do this to make sure larger data types such as int and float (4 bytes wide) are aligned to 4-byte boundaries, because some kinds of computer hardware take longer to load, move, or store unaligned int and float values.

In your first example, each array needs 8 bytes of memory (a char fills a single byte), so the compiler allocates exactly 8 bytes. In the second example each array is 3 bytes, so the compiler allocates 4 bytes, fills the first 3 bytes with your data, and leaves the 4th byte unused.

Now in this second case it appears the unused byte was filled with a null which explains why cout stopped at the end of the string. But as others have pointed out, you cannot depend on unused bytes to be initialized to any particular value, so the behaviour of the program cannot be guaranteed.

If you change your sample arrays to have 4 bytes each, the program will behave as in the first example.
