Should I use real or sample data for unit tests?


Question

I'm writing a parser for the output of a legacy application. Since there is no spec for the file syntax, I've collected as many samples of these files as I could.

I'm writing the unit tests before implementing the parser (there is no other sane way to do this), but I'm not sure whether I should:

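  • Use real files produced by the application: read them, parse them, and compare the result with the expected output, which I would store in a separate file in JSON format.
  • Or create sample strings containing the tokens and possibilities I want to test, together with a dict holding the expected output (this is Python).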

I'm inclined toward the second alternative because I would test only what I need to, without all the "real-world" data included in the actual files, but I'm afraid I could forget to test one possibility or another.

What do you think?

Recommended answer

My suggestion is to do both. Write a set of integration tests that run through all the files you have and compare the results against the expected outputs, then write unit tests with hand-picked inputs to isolate the parsing logic.
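As a rough sketch of how the two kinds of test could sit side by side: the real file format is unknown here, so `parse` below is a hypothetical stand-in that reads one `key=value` pair per line. The function name `check_samples`, the `.txt`/`.json` pairing, and the sample file names are likewise assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def parse(text):
    """Hypothetical stand-in parser: one 'key=value' pair per non-blank line."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result


def check_samples(sample_dir):
    """Parse every *.txt file and compare it with the *.json file beside it.

    Returns the names of the sample files whose parsed output does not match.
    """
    failures = []
    for sample in sorted(Path(sample_dir).glob("*.txt")):
        expected = json.loads(sample.with_suffix(".json").read_text(encoding="utf-8"))
        if parse(sample.read_text(encoding="utf-8")) != expected:
            failures.append(sample.name)
    return failures


# Unit tests: small hand-written strings, one feature each.
def test_single_pair():
    assert parse("name=legacy") == {"name": "legacy"}


def test_blank_lines_are_ignored():
    assert parse("\na=1\n\nb=2\n") == {"a": "1", "b": "2"}


# Integration test: a real project would point this at the collected files.
def test_all_samples_match():
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-ins for one real sample file and its stored expected output.
        Path(tmp, "report1.txt").write_text("a=1\nb=2\n", encoding="utf-8")
        Path(tmp, "report1.json").write_text('{"a": "1", "b": "2"}', encoding="utf-8")
        assert check_samples(tmp) == []
```

The `test_*` functions use plain `assert`, so a runner such as pytest can collect them as they are. Returning the list of failing file names, rather than stopping at the first mismatch, shows at a glance how many real files the parser still gets wrong.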

I would recommend writing the integration tests first so that you write your parser outside-in. It might be discouraging to see a bunch of failing tests, but it will help you isolate your edge cases earlier.

By the way, I think this is a great question. I recently came across a similar problem: transforming large XML feeds from an upstream system into a proprietary format. My solution was to write a set of black-box integration tests for the full feeds, testing things like record counts and other high-level success metrics, then break the inputs down into smaller and smaller chunks until I was able to test all the permutations of the data. It was only then that I had a good understanding of how to build the parser.
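The progression from a coarse check on the whole feed to tests on ever smaller pieces might look something like the following. This is only an illustration: it uses a made-up plain-text feed with blank-line-separated records rather than XML, and `split_records`, `parse_record`, `parse_feed`, and `FEED` are all hypothetical names.

```python
def split_records(feed):
    """Assumed layout: records in the feed are separated by blank lines."""
    return [chunk for chunk in feed.split("\n\n") if chunk.strip()]


def parse_record(chunk):
    """Parse one record made of 'key: value' lines into a dict."""
    fields = {}
    for line in chunk.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def parse_feed(feed):
    """Parse a whole feed into a list of record dicts."""
    return [parse_record(chunk) for chunk in split_records(feed)]


FEED = "id: 1\nname: alpha\n\nid: 2\nname: beta\n"


def test_feed_record_count():
    # Black-box check on the full feed: only the record count is asserted.
    assert len(parse_feed(FEED)) == 2


def test_single_record():
    # Smaller chunk: one record, all fields checked.
    assert parse_record("id: 1\nname: alpha") == {"id": "1", "name": "alpha"}


def test_record_with_empty_value():
    # Smaller still: one specific variation found while breaking the feed down.
    assert parse_record("id: 3\nname:") == {"id": "3", "name": ""}
```

The count test can be written before anything is known about the record contents. Each variation discovered while shrinking the input then gets its own small test.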
