Code coverage with optimization

Question

Currently I have a bunch of unit tests for my C++ project, but I am not (yet) testing the code coverage. I am compiling the tests with -O3 optimization flag to expose potential subtle bugs, but it seems if I want to collect coverage information with tools like gcov, any optimization flag must be disabled. Should I build the tests twice (one with -O3, the other without)? How is this problem usually dealt with?

Recommended answer

There are typically many kinds of tests that one performs to assure the quality of the software, and different criteria for which compiler options each kind should be built with.

Typically, a build system offers two or more choices of builds, for example:

Debug: -O0 (no optimisation) with asserts

Release: "higher optimisation" (-O2, -Os or -O3, depending on what is "best" for your project) without asserts. This is usually the mode in which you deliver the code to customers.

Sometimes there is also a "Release+Asserts" build, so that you can still check the correctness of the code while running with some semblance of performance.
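As a rough illustration of what "with asserts" and "without asserts" mean in practice, here is a minimal C++ sketch. It assumes the common convention that Release builds define NDEBUG (for example via -DNDEBUG), which compiles the standard assert() macro away; the answer itself does not say how asserts are toggled. The helper names asserts_enabled and average are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// True when the standard assert() macro is active, i.e. NDEBUG is not defined.
// Assumption: Debug and Release+Asserts builds leave NDEBUG undefined, while
// plain Release builds define it.
constexpr bool asserts_enabled() {
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Hypothetical helper: arithmetic mean of a non-empty array.
// The precondition is only checked in builds that keep asserts.
double average(const double* values, std::size_t count) {
    assert(values != nullptr && count > 0 && "average() needs at least one value");
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += values[i];
    }
    return sum / static_cast<double>(count);
}
```

A test suite can report asserts_enabled() at start-up so that the log shows which kind of build produced a given result.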

Here are some categories that I think tests can be classed into:

  1. Functional correctness (aka "positive tests"). This is where you check that "the code works correctly under normal circumstances". Run in both Debug and Release.

  2. Negative tests. Check that error conditions work correctly: passing rubbish values should give errors (for example, "file that doesn't exist" should give E_NO_SUCH_FILE). Typically both Debug and Release.

  3. Stress tests. Run harsh tests that check that the software behaves correctly when you run it for a long time, with lots of threads, etc. Typically Debug mode, maybe both.

  4. Coverage. Run a set of tests to ensure that you "cover all paths", often with an allowance for "not covered" (for example, you should cover 95% of functions and 85% of branches). The allowance exists because some conditions may be extremely difficult to achieve without manually instrumenting the code: there are errors that only occur when the disk is completely full, or when the OS can't create a new process. Typically compiled as Debug.

  5. Fault tolerance tests. A form of "negative tests" where you insert "mock" functionality for memory allocation and similar, which simulates failures either sequentially or at random. The aim is to discover cases where an error is not detected and the code fails as a follow-on consequence of the earlier error, rather than producing the correct error at the correct place. Again, typically run with Debug, but it may be worth running in Release as well.

  6. Performance testing. This is where you measure the performance of your program: frames per second generated, lines per second in a compiler, gigabytes per hour in a file download system, etc. This should be compiled as per Release, as measuring performance in "not optimised" code is nearly always pointless.
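The fault tolerance item in the list above can be sketched in C++ as follows. This is only one possible shape, under stated assumptions: the FailingAllocator mock, the Status codes and the make_two_buffers function are all hypothetical stand-ins for a real project's allocator hook and code under test. The driver fails the first allocation, then the second, and so on, and checks that every injected failure is reported as an error at the point where it happens.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical status codes for the code under test.
enum class Status { Ok, NoMemory };

// Mock allocator: fails on the Nth call (1-based); 0 means never fail.
struct FailingAllocator {
    int fail_on_call = 0;
    int calls = 0;

    void* allocate(std::size_t bytes) {
        ++calls;
        if (calls == fail_on_call) {
            return nullptr;  // simulated allocation failure
        }
        return std::malloc(bytes);
    }
};

// Hypothetical code under test: needs two buffers and must report NoMemory,
// without leaking, if either allocation fails.
Status make_two_buffers(FailingAllocator& alloc) {
    void* first = alloc.allocate(64);
    if (first == nullptr) {
        return Status::NoMemory;
    }
    void* second = alloc.allocate(64);
    if (second == nullptr) {
        std::free(first);
        return Status::NoMemory;
    }
    std::free(second);
    std::free(first);
    return Status::Ok;
}

// Sequential fault injection: fail call 1, then call 2, and so on, until a
// run finishes without reaching the injected failure. Returns the number of
// failure points exercised, or -1 if a failure was not reported correctly.
int run_fault_injection() {
    for (int n = 1;; ++n) {
        FailingAllocator alloc;
        alloc.fail_on_call = n;
        const Status s = make_two_buffers(alloc);
        if (alloc.calls < n) {
            // The injected failure point was never reached: we are done.
            return (s == Status::Ok) ? n - 1 : -1;
        }
        if (s != Status::NoMemory) {
            return -1;  // an injected failure went undetected
        }
    }
}
```

Random injection works the same way, with the failing call chosen by a seeded random number generator instead of a counter, so that a failing run can be reproduced.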

For complex software products, you often have to compromise between "running everything" and "the time it takes". For example, running all 4000 functional tests in both Debug and Release mode may take 12 hours, while running them in Debug mode only takes 7 hours, so the latter is preferable. This compromise is the usual "engineering decision": "In an ideal world, you'd do this, but in the real world, we have to compromise, and here's why I think this configuration of tests is right".

For example, many test systems run light testing on every change to the source code [after "I think this works" from the engineer], heavier testing each night, and more tests over a weekend. This allows a compromise between the time it takes to run all the tests and the time it takes one engineer to make a small change.
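One way to picture this tiering is a test list in which each test is tagged with the lightest schedule that should run it. The C++ sketch below is an assumption about how such a selection could look, not a description of any particular test system; the Tier, TestCase and select_tests names and the sample test names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical schedules, ordered from lightest to heaviest.
enum class Tier { PerChange = 0, Nightly = 1, Weekend = 2 };

struct TestCase {
    std::string name;
    Tier min_tier;  // lightest schedule that includes this test
};

// Select every test whose minimum tier is at or below the current run, so a
// nightly run also includes the per-change tests, and a weekend run includes
// everything.
std::vector<std::string> select_tests(const std::vector<TestCase>& all, Tier run) {
    std::vector<std::string> selected;
    for (const TestCase& test : all) {
        if (static_cast<int>(test.min_tier) <= static_cast<int>(run)) {
            selected.push_back(test.name);
        }
    }
    return selected;
}

// Small self-check with made-up test names.
void check_selection() {
    const std::vector<TestCase> suite = {
        {"parse_valid_input", Tier::PerChange},
        {"stress_many_threads", Tier::Nightly},
        {"full_coverage_run", Tier::Weekend},
    };
    assert(select_tests(suite, Tier::PerChange).size() == 1);
    assert(select_tests(suite, Tier::Weekend).size() == 3);
}
```

The same tag could also carry the build configuration (Debug, Release, or both), which is how the Debug-only compromise described above would be recorded.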
