Does undefined behavior really help modern compilers to optimize generated code?


Question


Aren't modern compilers smart enough to generate code that is fast and safe at the same time?

Look at the code below:

std::vector<int> a(100);
for (int i = 0; i < 50; i++)
    { a.at(i) = i; }
...

It's obvious that an out-of-range error can never happen here, so a smart compiler could generate the following code:

std::vector<int> a(100);
for (int i = 0; i < 50; i++)
    { a[i] = i; } // operator[] doesn't check for out of range
...

Now let's check this code:

std::vector<int> a(unknown_function());
for (int i = 0; i < 50; i++)
    { a.at(i) = i; }
...

It can be changed to this equivalent:

std::vector<int> a(unknown_function());
size_t __loop_limit = std::min<size_t>(a.size(), 50); // explicit type: a.size() and 50 have different types
for (int i = 0; i < __loop_limit; i++)
    { a[i] = i; }
if (50 > a.size())
    { throw std::out_of_range("oor"); }
...

Also, we know that the destructor and assignment operator of int have no side effects. So we can transform the code into the following equivalent:

size_t __tmp = unknown_function();
if (50 > __tmp)
    { throw std::out_of_range("oor"); }
std::vector<int> a(__tmp);
for (int i = 0; i < 50; i++)
    { a[i] = i; }
...

(I'm not sure that such an optimization is allowed by the C++ standard, because it excludes the memory allocation/deallocation steps, but let's think of a C++-like language that allows it.)

And, OK, this optimized code is not as fast as the following code:

std::vector<int> a(unknown_function());
for (int i = 0; i < 50; i++)
    { a[i] = i; }

because there is an additional check, if (50 > __tmp), which you don't really need if you are certain that unknown_function never returns a value less than 50. But the performance improvement is not very high in this case.
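The hoisted-check transformation discussed above can be written out by hand to compare the two forms. This is only a sketch: the function names `fill_checked`, `fill_hoisted`, and `throws_out_of_range` are hypothetical, and the parameter `n` stands in for the result of `unknown_function()`. It does not show what any particular compiler actually emits.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Original form: every access goes through the bounds-checked at().
// `n` stands in for the result of unknown_function() (an assumption).
std::vector<int> fill_checked(std::size_t n) {
    std::vector<int> a(n);
    for (int i = 0; i < 50; i++) {
        a.at(i) = i;  // throws std::out_of_range when i >= a.size()
    }
    return a;
}

// Hand-transformed form: one check before the allocation, then
// unchecked accesses that are known to be in range.
std::vector<int> fill_hoisted(std::size_t n) {
    if (50 > n) {
        throw std::out_of_range("oor");
    }
    std::vector<int> a(n);
    for (int i = 0; i < 50; i++) {
        a[i] = i;  // in range: n >= 50 was established above
    }
    return a;
}

// Hypothetical helper for comparing the two forms: reports whether a
// call throws std::out_of_range for the given size.
template <typename F>
bool throws_out_of_range(F f, std::size_t n) {
    try {
        f(n);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Both functions return the same vector when `n` is at least 50 and throw `std::out_of_range` otherwise. They still differ in ways a caller could observe: the exception message is not the same, and the hoisted form never allocates when it throws.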

Please note that my question is a little different from this question: Is undefined behavior worth it? That question asks whether the advantages of performance improvements outweigh the shortcomings of undefined behavior. It assumes that undefined behavior really helps to optimize code. My question is: is it possible to achieve almost the same (maybe a little less) level of optimization in a language without undefined behavior as in a language with undefined behavior?

The only case I can think of where undefined behavior can really help improve performance significantly is manual memory management. You never know whether the address a pointer points to has already been freed. Someone can hold a copy of the pointer and then call free on it, while your pointer still points to the same address. To avoid this undefined behavior, you either have to use a garbage collector (which has its own disadvantages) or maintain a list of all pointers that point to the address and, when the address is freed, nullify all of them (and check them for null before every access).
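One existing C++ facility with a similar flavor is `std::weak_ptr`. It does not keep a list of pointers to nullify; it relies on a shared reference-counted control block instead. It still shows the cost described above, because every access has to check whether the object is alive. A minimal sketch, with `read_or` as a hypothetical helper name:

```cpp
#include <memory>

// Reads through a non-owning handle. Returns `fallback` if the object
// has already been destroyed, instead of touching freed memory.
int read_or(const std::weak_ptr<int>& handle, int fallback) {
    // lock() yields an empty shared_ptr once the last owner is gone;
    // this is the dynamic check that a raw pointer does not pay for.
    if (std::shared_ptr<int> alive = handle.lock()) {
        return *alive;
    }
    return fallback;
}
```

For example, after `auto owner = std::make_shared<int>(42); std::weak_ptr<int> handle = owner;`, the call `read_or(handle, -1)` reads the stored value. Once `owner.reset()` has run, the same call returns the fallback rather than dereferencing a dangling pointer.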

Providing defined behavior for a multi-threaded environment would probably incur performance costs too.

P.S. I am not sure that fully defined behavior can be achieved in a C-like language, but I added C to the tags too.

Answer

> My question is: is it possible to achieve almost the same (maybe a little less) level of optimization in a language without undefined behavior as in a language with undefined behavior.

Yes, by using a type-safe language. Languages such as C and C++ need the concept of undefined behavior precisely because they are not type-safe (which basically means any pointer can point anywhere at any time). As a result, in far too many cases the compiler cannot statically prove that no violation of the language specification can occur in any execution of the program, even when that is actually the case. That's because of the hard limitations of pointer analysis. Without undefined behavior, the compiler would have to insert too many dynamic checks, most of which are not really needed, but the compiler cannot figure that out.

Consider, for example, safe C# code where a function accepts a pointer to an object of some type (an array). Because of the way the language and the underlying virtual machine are designed, it's guaranteed that the pointer points to an object of the expected type. This is ensured statically. The code emitted for C# still requires dynamic bounds and type checks in certain cases, but compared to C/C++, the number of dynamic checks required to implement fully defined behavior is tiny and typically affordable. Many C# programs can achieve performance equal to or only slightly below that of the corresponding C++ programs, although that depends heavily on how they are compiled.
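The difference between an unchecked and a checked access can be sketched in C++ itself. The names `load_unchecked`, `load_checked`, and `IntView` are hypothetical and exist only for illustration. The point is that a bare pointer carries nothing to check against, while a pointer that travels with its length allows a cheap dynamic check.

```cpp
#include <cstddef>
#include <stdexcept>

// Raw pointer: the callee has no way to validate `i`.
// An out-of-range `i` is undefined behavior, and no check is emitted.
int load_unchecked(const int* p, std::size_t i) {
    return p[i];
}

// Hypothetical pointer-plus-length view, similar in spirit to an
// array that knows its own size.
struct IntView {
    const int* data;
    std::size_t size;
};

// Defined behavior for every `i`, at the price of one comparison per
// access unless the compiler can prove the check redundant.
int load_checked(IntView v, std::size_t i) {
    if (i >= v.size) {
        throw std::out_of_range("index out of range");
    }
    return v.data[i];
}
```

Given `int xs[3] = {10, 20, 30};`, the call `load_checked(IntView{xs, 3}, 3)` throws, whereas `load_unchecked(xs, 3)` would read past the end of the array.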

> The only case I can think of where undefined behavior can really help improve performance significantly is manual memory management.

That's not the only case, as explained above.

> Providing defined behavior for a multi-threaded environment would probably incur performance costs too.

Not sure what you mean here. The memory model specified by a language defines the behavior of multi-threaded programs. These models range from very relaxed to very strict (see the C++ memory model, for example).
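As a small illustration of that range, C++ lets each atomic operation choose its ordering. The sketch below uses a hypothetical function, `count_relaxed`, with the most relaxed ordering for the increments. The result is still well defined, because each increment is atomic and `join()` synchronizes with the end of each thread. The same loop on a plain `int` would be a data race, which is undefined behavior in C++.

```cpp
#include <atomic>
#include <thread>

// Two threads increment a shared counter `per_thread` times each.
// memory_order_relaxed guarantees atomicity of each increment but
// imposes no ordering on surrounding memory operations.
int count_relaxed(int per_thread) {
    std::atomic<int> counter{0};
    auto work = [&counter, per_thread] {
        for (int i = 0; i < per_thread; i++) {
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();
    t2.join();
    return counter.load();  // default ordering: memory_order_seq_cst
}
```

Replacing `std::memory_order_relaxed` with the default `std::memory_order_seq_cst` gives the strictest ordering the standard offers. How much that costs depends on the target hardware.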
