Code correctness, and testing strategies


Problem description


Hi list.

What strategies do you use to ensure correctness of new code?

Specifically, if you've just written 100 new lines of Python code, then:

1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

Short version:

For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

For (2) I code defensively.

Long version:

For (2), I have a lot of error checks, similar to contracts (post &
pre-conditions, invariants). I've read about Python libs which help
formalize this[1][2], but I don't see a great advantage over using
regular ifs and asserts (and a few disadvantages, like additional
complexity). Simple ifs are good enough for Python built-in libs :-)
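
For illustration, a minimal sketch of what those contract-style checks
look like as ordinary ifs (the function and its invariant are made up
for the example):

def withdraw(balance, amount):
    # Preconditions
    if amount <= 0:
        raise ValueError("amount must be positive")
    if amount > balance:
        raise ValueError("amount exceeds balance")
    new_balance = balance - amount
    # Postcondition / invariant
    if new_balance < 0:
        raise RuntimeError("invariant violated: balance went negative")
    return new_balance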

[1] PEP 316: http://www.python.org/dev/peps/pep-0316/
[2] An implementation:
http://aspn.activestate.com/ASPN/Coo.../Recipe/436834

An aside: What is the correct situation in which to use assert
statements in Python? I'd like to use them for enforcing 'contracts'
because they're quick to type, but from the docs:

"Assert statements are a convenient way to insert debugging assertions
into a program:"

So to me it sounds like 'assert' statements are only useful while
debugging, and not when an app is live, where you would also
(especially!) want it to enforce contracts. Also, asserts can be
removed with -O, and you only ever get AssertionError, where
ValueError and the like might be more appropriate.
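
A small sketch of that difference (the parameter check here is invented
for the example):

# assert version: stripped entirely when run as "python -O script.py",
# and the caller only ever sees AssertionError.
def set_ratio_with_assert(ratio):
    assert 0.0 <= ratio <= 1.0, "ratio out of range"
    return ratio

# explicit check: always enforced, and raises the more precise error.
def set_ratio_checked(ratio):
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1, got %r" % (ratio,))
    return ratio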

As for point 1 (how do you test the new code?):

I like the idea of automated unit tests. However, in practice I find
they take a long time to write and test, especially if you want to
have good coverage (not just lines, but also possible logic branches).

So instead, I prefer to thoroughly test new code manually, and only
then check in to version control. I feel that if you are disciplined,
then unit tests are mainly useful for:

1) Maintenance of legacy code
2) More than 1 person working on a project

One recent personal example:

My workstation is a Debian Unstable box. I like to upgrade regularly
and try out new library & app versions. Usually this doesn't cause
major problems. One exception is sqlalchemy. Its API seems to change
every few months, causing warnings and breakage in code which used the
old API. This happened regularly enough that for one project I spent a
day adding unit tests for the ORM-using code, and getting the unit
tests up to 100% coverage. These tests should allow me to quickly
catch and fix all sqlalchemy API breakages in my app in the future.
The breakages also make me want to stop using ORM entirely, but it
would take longer to switch to SQL-only code than to keep the unit
tests up to date :-)

My 'test code thoroughly before checkin' methodology is as follows:

1) Add "raise ''UNTESTED''" lines to the top of every function
2) Run the script
3) Look where the script terminated
4) Add print lines just before the exception to check the variable values
5) Re-run and check that the values have expected values.
6) Remove the print and 'raise "UNTESTED"' lines
7) Add liberal 'raise "UNTESTED"' lines to the body of the function.
8.1) For short funcs, before every line (if it seems necessary)
8.2) For longer funcs, before and after each logic entry/exit point
(blocks, exits, returns, throws, etc):

eg, before:

if A():
    B()
    C()
    D()
E()

after:

raise 'UNTESTED'
if A():
    raise 'UNTESTED'
    B()
    C()
    D()
    raise 'UNTESTED'
raise 'UNTESTED'
E()

8.2.1) Later I add "raise 'UNTESTED'" lines before each line in the
blocks also, if it seems necessary.

9) Repeat steps 2 to 8 until the script stops throwing exceptions
10) Check for 'raise "UNTESTED"' lines still in the script
11) Cause those sections of code to be run also (sometimes I need to
temporarily set vars to impossible values inside the script, since the
logic will never run otherwise)

And here is one of my biggest problems with unit tests. How do you unit
test code which almost never runs? The only easy way I can think of is
for the code to have 'if <some almost impossible condition> or <busy
running test case XYZ>' lines. I know I'm meant to make 'fake' testing
classes which return erroneous values, and then pass these objects to
the code being tested. But this can take a long time and even then
isn't guaranteed to reach all your error-handling code.
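
As a sketch of that 'fake' approach (the connection class and the
function under test are invented for the example), the rarely-run error
branch can be reached by handing the code an object that fails on purpose:

import unittest

class FailingConnection:
    """Stub standing in for a real connection; always errors out."""
    def execute(self, statement):
        raise IOError("simulated connection failure")

def save_record(conn, record):
    # Code under test: in production this except branch almost never runs.
    try:
        conn.execute("INSERT INTO records VALUES (%r)" % (record,))
        return True
    except IOError:
        return False

class SaveRecordTests(unittest.TestCase):
    def test_failure_is_reported_not_raised(self):
        self.assertFalse(save_record(FailingConnection(), {"id": 1}))

if __name__ == "__main__":
    unittest.main()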

The above methodology works well for me. It goes fairly quickly, and
is much faster than writing and testing elaborate unit tests.

So finally, my main questions:

1) Are there any obvious problems with my 'correctness' strategies?

2) Should I (regardless of time it takes initially) still be adding
unit tests for everything? I'd like to hear what XP/agile programming
advocates have to say on the subject.

3) Are there easy and fast ways to write and test (complete) unit tests?

4) Any other comments?

Thanks for your time.

David.

Solution

David wrote:

Specifically, if you've just written 100 new lines of Python code, then:
1) How do you test the new code?
2) How do you ensure that the code will work correctly in the future?

Short version:

For (1) I thoroughly (manually) test code as I write it, before
checking in to version control.

For (2) I code defensively.

....

As for point 1 (how do you test the new code?):
I like the idea of automated unit tests. However, in practice I find
they take a long time to write and test, especially if you want to
have good coverage (not just lines, but also possible logic branches).

This is why I have reluctantly come to accept the XP people's view:
if you write the tests _as_ you develop (that is as far as I go
w/o re-enforcement; they would have you write them _before_), you will
have a body of tests that work to demonstrate the correctness or
deficiencies of your code based on what it _should_ do. If you write
tests after you've written the code, you will write tests that are
based on what your code _actually_does_. You don't want the latter;
the tests are brittle. The tests don't match needs, rather they
match implementations. Therefore you'll need to discard more tests at
every local rewrite.

1) Add "raise ''UNTESTED''" lines to the top of every function

String exceptions are deprecated. Just raise UNTESTED (and let the
access to undefined global error be the issue).
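
In today's Python, string exceptions are gone entirely, so a rough
modern equivalent of the same marker trick is a throwaway exception
class (the class and function names are made-up placeholders):

class Untested(Exception):
    """Marker: this code path has not been exercised yet."""

def merge_accounts(a, b):
    raise Untested("merge_accounts")  # delete once this path is covered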
....<describes how to do code coverage by hand>...

11) Cause those sections of code to be run also (sometimes I need to
temporarily set vars to impossible values inside the script, since the
logic will never run otherwise)

And here is one of my biggest problems with unit tests. How do you unit
test code which almost never runs? The only easy way I can think of is
for the code to have 'if <some almost impossible condition> or <busy
running test case XYZ>' lines. I know I'm meant to make 'fake' testing
classes which return erroneous values, and then pass these objects to
the code being tested. But this can take a long time and even then
isn't guaranteed to reach all your error-handling code.

Ah, but now your tests are "brittle"; they only work for the code you
have now.
If you want to make sure you have code coverage with your test, the XP
way is:
Write a test for behavior you need.
Watch it fail.
Fix the code so all tests pass.
Lather, rinse, repeat.
You should not have untested code, because there was no test that made
you write it. If you want to do code coverage, find a code coverage
tool and count your code while running your unit tests.
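
A minimal sketch of that loop with the standard unittest module, plus
the usual coverage.py commands for the counting step (coverage.py is
assumed to be installed; the function itself is invented):

import unittest

def parse_port(text):
    # Written only after the test below had been watched to fail.
    value = int(text)
    if not 0 < value < 65536:
        raise ValueError("port out of range: %d" % value)
    return value

class ParsePortTests(unittest.TestCase):
    # The test written first, for the behaviour that is needed.
    def test_rejects_out_of_range_port(self):
        with self.assertRaises(ValueError):
            parse_port("70000")

if __name__ == "__main__":
    unittest.main()

# Coverage, from a shell:
#   coverage run -m unittest discover
#   coverage report -m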
--Scott David Daniels
Sc***********@Acm.Org


On Sat, 24 May 2008 17:51:23 +0200
David <wi******@gmail.com> wrote:

Basically, with TDD you write the tests first, then the code which
passes/fails the tests as appropriate. However, as you're writing the
code you will also think of a lot of corner cases you should also
handle. The natural way to do this is to add them to the code first.
But with TDD you have to first write a test for the corner case, even
if setting up test code for it is very complicated. So, you have these
options:

- Take as much time as needed to put a complicated test case in place.

Absolutely. You may think that it is slowing you down but I can assure
you that in the long run you are saving yourself time.

- Don't add the corner case to your code because you can't (don't have
time to) write a test for it.

If you don't have time to write complete, working, tested code then you
have a problem with your boss/client, not your methodology.

- Add the corner case handling to the code first, and try to add a
test later if you have time for it.

Never! It won't happen.

Having to write tests for all code takes time. Instead of eg: 10 hours
coding and say 1/2 an hour manual testing, you spend eg: 2-3 hours
writing all the tests, and 10 on the code.

In conventional development, 10 hours of code requires 90 hours of
testing, debugging and maintenance. Under TDD (and agile in general)
you spend 20 hours testing and coding. That's the real economics if
you want to deliver a good product.

I think that automated tests can be very valuable for maintainability,
making sure that you or other devs don't break something down the
line. But these benefits must be worth the time (and general
inconvenience) spent on adding/maintaining the tests.

I can assure you from experience that it always is worth the time.

If I did start doing some kind of TDD, it would be more of the 'smoke
test' variety. Call all of the functions with various parameters, test
some common scenarios, all the 'low hanging fruit'. But don't spend a
lot of time trying to test all possible scenarios and corner cases,
100% coverage, etc, unless I have enough time for it.

Penny wise, pound foolish. Spend the time now or spend the time later
after your client complains.

I'm going to read more on the subject (thanks to Ben for the link).
Maybe I have some misconceptions.

Perhaps just lack of experience. Read up on actual case studies.

--
D'Arcy J.M. Cain <da***@druid.net | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.


On Sat, 24 May 2008 21:14:36 +0200
David <wi******@gmail.com> wrote:

Is it considered to be cheating if you make a test case which always
fails with a "TODO: Make a proper test case" message?

Yes. It's better to have the daily reminder that some code needs to be
finished.

While it is possible to describe all problems in docs, it can be very
hard to write actual test code.

It may be hard to start but once you have your framework in place it
becomes very easy.

For example: sanity tests. Functions can have tests for situations
that can never occur, or are very hard to reproduce. How do you unit
test for those?

Believe me, thousands of people reading this are remembering situations
where something that couldn't possibly happen happened.

A few examples off the top of my head:

* Code which checks for hardware defects (pentium floating point,
memory or disk errors, etc).

* Code that checks that a file is less than 1 TB large (but you only
have 320 GB hard drives in your testing environment).

* Code which checks if the machine was rebooted over a year ago.

And so on. These I would manually test by temporarily changing
variables in the code, then changing them back. To unit test these you
would need to write mock functions and arrange for the tested code to
call them instead of the python built-ins.

Yes but the mock functions can be wrappers around the real functions
which only change the results that you are testing for.

eg: You call function MyFunc with argument X, and expect to get result Y.

MyFunc calls __private_func1, and __private_func2.

You can check in your unit test that MyFunc returns result Y, but you
shouldn't check __private_func1 and __private_func2 directly, even if
they really should be tested (maybe they sometimes have unwanted side
effects unrelated to MyFunc's return value).

It isn't your job to test __private_func1 and __private_func2 unless
you are writing MyFunc.
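
A sketch pulling together the two points above: a mock that wraps the
real function and changes only the result under test, with the assertion
made only on the public function's return value. The scenario reuses the
1 TB file check; unittest.mock and all names here are assumptions, not
from the original posts.

import os
import unittest
from unittest import mock

ONE_TB = 1024 ** 4

def file_is_reasonable(path):
    # Public function under test: rejects files of 1 TB or more.
    return os.path.getsize(path) < ONE_TB

class FileSizeTests(unittest.TestCase):
    def test_rejects_huge_file(self):
        real_getsize = os.path.getsize

        def fake_getsize(path):
            # Wrapper around the real function: only the path under test
            # gets a faked size; everything else behaves normally.
            if path == "/tmp/huge.img":
                return 2 * ONE_TB
            return real_getsize(path)

        with mock.patch("os.path.getsize", fake_getsize):
            # Assert only on the public result, not on the helper itself.
            self.assertFalse(file_is_reasonable("/tmp/huge.img"))

if __name__ == "__main__":
    unittest.main()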

Depends on the type of bug. If it's a bug which breaks the unit tests,
then it can be found quickly. Unit tests won't help with bugs they
don't explicitly cover. eg off-by-one, memory leaks, CPU load,
side-effects (outside what the unit tests test), and so on.

No but when you find that your code breaks due to these problems that's
when you write new unit tests.

But once you track down problems like the above you can write more
unit tests to catch those exact bugs in the future. This is one case
where I do favour unit tests.

Yes! One of the biggest advantages to unit testing is that you never
ever deliver the same bug to the client twice. Delivering software
with a bug is bad but delivering it with the same bug after it was
reported and fixed is calamitous.
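
In practice that usually becomes a named regression test, roughly like
this sketch (the bug number and the function are made up):

import unittest

def normalize_name(name):
    # Function under regression test (invented for the example).
    return " ".join(name.split()).title()

class RegressionTests(unittest.TestCase):
    def test_issue_142_whitespace_only_name(self):
        # Hypothetical bug #142: whitespace-only input used to crash.
        # Keeping this test means the same bug never ships twice.
        self.assertEqual(normalize_name("   "), "")

if __name__ == "__main__":
    unittest.main()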

--
D'Arcy J.M. Cain <da***@druid.net | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.

