Initialising int affects function return value

Question

Sorry for the vagueness of this question's title, but I'm not sure how to ask this exactly.

The following code works fine when executed on an Arduino (C++ compiled for an ATmega328 microcontroller). The return values are shown in comments in the code:

// Return the index of the first semicolon in a string
int detectSemicolon(const char* str) {

    int i = 0;

    Serial.print("i = ");
    Serial.println(i); // prints "i = 0"

    while (i <= strlen(str)) {
        if (str[i] == ';') {
            Serial.print("Found at i = ");
            Serial.println(i); // prints "Found at i = 2"
            return i;
        }
        i++;
    }

    Serial.println("Error"); // Does not execute
    return -999;
}

void setup() {
    Serial.begin(250000);
    Serial.println(detectSemicolon("TE;ST")); // Prints "2"
}

void loop() {} // Required by the Arduino core; nothing to repeat here

This outputs "2" as the position of the first semicolon, as expected.

However, if I change the first line of the detectSemicolon function to int i; i.e. without the explicit initialisation, I get problems. Specifically, the output is "i = 0" (good), "Found at i = 2" (good), "-999" (bad!).

So the function is returning -999 despite having executed the print statement immediately before a return 2; line and despite never executing the print statement immediately before the return -999; line.

Can someone help me to understand what's happening here? I understand that variables inside functions in C can theoretically contain any old junk unless they're initialised, but here I'm specifically checking in a print statement that this hasn't happened, and yet...

Thanks to everyone who's chipped in, and particularly to underscore_d for their great answer. It seems that undefined behaviour is indeed causing the compiler to skip anything involving i. Here's some of the assembly, with the Serial.print calls within detectSemicolon commented out:

void setup() {
    Serial.begin(250000);
    Serial.println(detectSemicolon("TE;ST")); // Prints "2"
  d0:   4a e0           ldi r20, 0x0A   ; 10
  d2:   50 e0           ldi r21, 0x00   ; 0
  d4:   69 e1           ldi r22, 0x19   ; 25
  d6:   7c ef           ldi r23, 0xFC   ; 252
  d8:   82 e2           ldi r24, 0x22   ; 34
  da:   91 e0           ldi r25, 0x01   ; 1
  dc:   0c 94 3d 03     jmp 0x67a   ; 0x67a <_ZN5Print7printlnEii>

It looks like the compiler is completely disregarding the while loop and concluding that the output will always be "-999", so it doesn't even bother calling the function and instead hard-codes 0xFC19 (-999 as a 16-bit two's-complement value). I'll have another look with the Serial.print calls enabled so that the function still gets called, but I think this is a strong pointer.

Edit 2:

For those who really care, here's a link to the disassembled code exactly as shown above (in the UB case):

https://justpaste.it/vwu8

If you look carefully, the compiler seems to be designating register 28 as the location of i and "initialising" it to zero in line d8. This register gets treated as if it contains i throughout in the while loops, if statements etc, which is why the code appears to work and the print statements output as expected (e.g. line 122 where "i" gets incremented).

However, when it comes to returning this pseudo-variable, this is a step too far for our tried and tried-upon compiler; it draws the line, and dumps us to the other return statement (line 120 jumps to line 132, loading "-999" into registers 24 and 25 before returning to main()).

Or at least, that's as far as I can get with my limited grasp of assembly. The moral of the story is that weird stuff happens when your code's behaviour is undefined.

Recommended answer

Like all basic types with non-static storage duration, an int declared without an initialiser is not given a value: it is left uninitialised. That does not mean i just holds a random value. It holds no (known, valid) value, and therefore you're not allowed to read it yet.

Here's the relevant quote from the C++11 Standard, via Angew in the comments. This wasn't a new restriction, nor has it changed since then:

C++11 4.1/1, talking about an lvalue-to-rvalue conversion (basically reading a variable's value): "If the object to which the glvalue refers is ... uninitialized, a program that necessitates this conversion has undefined behavior."

Any read of an uninitialised variable causes undefined behaviour, and so anything can happen. Rather than your program continuing to function as expected using some unknown default value, compilers can make it do absolutely anything, because the behaviour is undefined, and the Standard imposes no requirements on what should happen in such a scenario.

In practical terms, that usually means an optimising compiler might simply remove any code that relies in any way on UB. There's no way to make a correct decision about what to do, so it's perfectly valid to decide to do nothing (which also happens to be an optimisation for size and often speed). Or, as commenters have mentioned, it might keep the code but replace attempts to read i with the nearest unrelated value to hand, or with different constants in different statements, and so on.

Printing a variable doesn't count as 'checking it' as you think, so that makes no difference. There is no way to 'check' an uninitialised variable and thereby to inoculate yourself against UB. The behaviour of reading the variable is only defined if the program has already written a specific value to it.

There is no point in us speculating on why particular arbitrary types of UB occur: you just need to fix your code so that it operates deterministically.

Why do you want to use it uninitialised anyway? Is this just 'academic'?
