Initialising int affects function return value


Problem description

Sorry for the vagueness of this question's title, but I'm not sure how to ask this exactly.

The following code, when executed on an Arduino (C++ compiled for an ATMega328 microcontroller), works fine. The return values are shown in the comments in the code:

// Return the index of the first semicolon in a string
int detectSemicolon(const char* str) {

    int i = 0;

    Serial.print("i = ");
    Serial.println(i); // prints "i = 0"

    while (i <= strlen(str)) {
        if (str[i] == ';') {
            Serial.print("Found at i = ");
            Serial.println(i); // prints "Found at i = 2"
            return i;
        }
        i++;
    }

    Serial.println("Error"); // Does not execute
    return -999;
}

void main() {
    Serial.begin(250000);
    Serial.println(detectSemicolon("TE;ST")); // Prints "2"
}

This outputs "2" as the position of the first semicolon, as expected.

However, if I change the first line of the detectSemicolon function to int i; i.e. without the explicit initialisation, I get problems. Specifically, the output is "i = 0" (good), "Found at i = 2" (good), "-999" (bad!).

So the function is returning -999 despite having executed the print statement immediately before a return 2; line and despite never executing the print statement immediately before the return -999; line.

Can someone help me to understand what's happening here? I understand that variables inside functions in C can theoretically contain any old junk unless they're initialised, but here I'm specifically checking in a print statement that this hasn't happened, and yet...


EDIT: Thanks to everyone who's chipped in, and particularly to underscore_d for their great answer. It seems like undefined behaviour is indeed causing the compiler to just skip anything involving i. Here's some of the assembly with the serial.prints within detectSemicolon commented out:

void setup() {
    Serial.begin(250000);
    Serial.println(detectSemicolon("TE;ST")); // Prints "2"
  d0:   4a e0           ldi r20, 0x0A   ; 10
  d2:   50 e0           ldi r21, 0x00   ; 0
  d4:   69 e1           ldi r22, 0x19   ; 25
  d6:   7c ef           ldi r23, 0xFC   ; 252
  d8:   82 e2           ldi r24, 0x22   ; 34
  da:   91 e0           ldi r25, 0x01   ; 1
  dc:   0c 94 3d 03     jmp 0x67a   ; 0x67a <_ZN5Print7printlnEii>

It looks like the compiler is actually completely disregarding the while loop and concluding that the output will always be "-999", and so it doesn't even bother with a call to the function, instead hard coding 0xFC19. I'll have another look with the serial.prints enabled so that the function still gets called, but this is a strong pointer I think.
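For reference, 0xFC19 is simply the bit pattern of -999 in a 16-bit two's-complement int (int is 16 bits wide on the ATmega328). The listing loads it one byte at a time: 0x19 into r22 and 0xFC into r23. Below is a minimal sketch in portable C++ that checks this arithmetic. The helper names are hypothetical and not part of the original code:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not from the original sketch): view a 16-bit signed
// value as its raw two's-complement bit pattern and split it into bytes.
constexpr std::uint16_t bitPattern(std::int16_t value) {
    return static_cast<std::uint16_t>(value); // conversion is modulo 2^16
}

constexpr std::uint8_t lowByte16(std::uint16_t bits) {
    return static_cast<std::uint8_t>(bits & 0xFF);
}

constexpr std::uint8_t highByte16(std::uint16_t bits) {
    return static_cast<std::uint8_t>(bits >> 8);
}

// 65536 - 999 = 64537 = 0xFC19
static_assert(bitPattern(-999) == 0xFC19, "-999 as a 16-bit pattern is 0xFC19");

// Runtime check mirroring the two ldi instructions in the listing.
inline void checkAgainstListing() {
    const std::uint16_t bits = bitPattern(-999);
    assert(lowByte16(bits) == 0x19);  // value loaded into r22
    assert(highByte16(bits) == 0xFC); // value loaded into r23
}
```

Calling checkAgainstListing() passes silently, which is consistent with the reading that the compiler folded the whole call down to the constant -999.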


EDIT 2:

For those who really care, here's a link to the disassembled code exactly as shown above (in the UB case):

https://justpaste.it/vwu8

If you look carefully, the compiler seems to be designating register 28 as the location of i and "initialising" it to zero in line d8. This register gets treated as if it contains i throughout in the while loops, if statements etc, which is why the code appears to work and the print statements output as expected (e.g. line 122 where "i" gets incremented).

However, when it comes to returning this pseudo-variable, this is a step too far for our tried and tried-upon compiler; it draws the line, and dumps us to the other return statement (line 120 jumps to line 132, loading "-999" into registers 24 and 25 before returning to main()).

Or at least, that's as far as I can get with my limited grasp of assembly. Moral of the story is weird stuff happens when your code's behaviour is undefined.

Solution

Like any object of a fundamental type with automatic (non-static) storage duration, an int defined without an initialiser is not given a value: default-initialisation does nothing for it. The variable is left uninitialised. That does not mean i just holds a random value. It holds no (known, valid) value, and therefore you're not allowed to read it yet.
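As a rough illustration of the difference, here is a minimal sketch of the common ways an int does or does not receive a value. The names are hypothetical, and the uninitialised local is deliberately assigned before it is ever read, so the example itself stays well defined:

```cpp
#include <cassert>

// Hypothetical names, for illustration only.
int zeroedGlobal; // static storage duration: zero-initialised before main runs

int initialisedLocals() {
    int a = 0; // explicit initialiser
    int b{};   // value-initialisation, also gives 0
    int c;     // no initialiser: c has no usable value yet
    c = 5;     // writing before the first read makes later reads well defined

    assert(a == 0 && b == 0);
    return a + b + c;
}
```

Here initialisedLocals() returns 5. Removing the assignment c = 5; would turn the final read of c into exactly the kind of undefined behaviour seen in the question.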

Here's the relevant quote from the C++11 Standard, via Angew in the comments. This wasn't a new restriction, nor has it changed since then:

C++11 4.1/1, talking about an lvalue-to-rvalue conversion (basically reading a variable's value): "If the object to which the glvalue refers is ... uninitialized, a program that necessitates this conversion has undefined behavior."

Any read of an uninitialised variable causes undefined behaviour, and so anything can happen. Rather than your program continuing to function as expected using some unknown default value, compilers can make it do absolutely anything, because the behaviour is undefined, and the Standard imposes no requirements on what should happen in such a scenario.

In practical terms, that usually means an optimising compiler might simply remove any code that relies in any way on UB. There's no way to make a correct decision about what to do, so it's perfectly valid to decide to do nothing (which just happens also to be an optimisation for size and often speed). Or, as commenters have mentioned, it might keep the code but replace attempts to read i with the nearest unrelated value to hand, or with different constants in different statements, and so on.

Printing a variable doesn't count as 'checking it' as you think, so that makes no difference. There is no way to 'check' an uninitialised variable and thereby to inoculate yourself against UB. The behaviour of reading the variable is only defined if the program has already written a specific value to it.

There is no point in us speculating on why particular arbitrary types of UB occur: you just need to fix your code so that it operates deterministically.

Why do you want to use it uninitialised anyway? Is this just 'academic'?
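For completeness, here is a minimal sketch of a deterministic version. It is written as portable C++ so it can be tested off the board, which means the Serial calls are omitted. The name kNotFound is hypothetical (the value -999 is the one from the question), and the loop stops at the string terminator rather than calling strlen on every iteration, which is a small change from the original code:

```cpp
#include <cassert>

// Sentinel value taken from the question; the name is hypothetical.
constexpr int kNotFound = -999;

// Return the index of the first semicolon in str, or kNotFound if there is none.
int detectSemicolon(const char* str) {
    assert(str != nullptr);

    // i receives a value in its declaration, so every later read is well defined.
    for (int i = 0; str[i] != '\0'; ++i) {
        if (str[i] == ';') {
            return i;
        }
    }
    return kNotFound;
}
```

With this version, detectSemicolon("TE;ST") returns 2 and detectSemicolon("TEST") returns -999. On GCC-based toolchains, building with -Wall should also warn about reads of uninitialised locals in many cases.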
