How do you unit test regular expressions?

Question

I'm new to TDD, and I find RegExp quite a particular case. Is there any special way to unit test them, or may I just treat them as regular functions?

Recommended answer

You should always test your regexen, much like any other chunk of code. At their simplest, they are a function that takes a string and returns a bool, or returns an array of values.

Here are some suggestions on what to think about when designing unit tests for regexen. These are not hard and fast prescriptions for unit test design, but guidelines to shape your thinking. As always, weigh your testing needs against the cost of failure and the time required to implement all the tests. (I find that 'implementing' the test is the easy part! :-] )

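Points to consider:

  • Think of every group (the parentheses) as a curly brace.
  • Think of every | as a condition. Make sure to test each branch.
  • Think of every modifier (*, +, ?) as a different path.
  • (Side note to the above: remember the difference between *, +, ? and *?, +?, ??.)
  • For \d, \s, \w, and their negations, try several values in each range.
  • For * and +, you need to test the 'no value', 'one of', and 'one or more' cases for each.
  • For important 'control' characters (eg, strings in the regex that you look for), test what happens if they show up in the wrong places. This may surprise you.
  • If you have real world data, use as much of it as you can.
  • If you don't, make sure to test both simple and complex forms that should be valid.
  • Make sure to test what regex control characters do when inserted.
  • Make sure to verify that the empty string is properly accepted/rejected.
  • Make sure to verify that a string of each of the different kinds of space characters is properly accepted or rejected.
  • Make sure that case-insensitivity is handled properly (the i flag). This has bitten me more times than almost anything else in text parsing (other than spaces).
  • If you have the x, m, or s options, make sure you understand what they do and test for them (the behavior here can differ).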
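As a rough illustration of the points above, here is a minimal Python sketch. The pattern `SIZE`, the helper `is_size`, and the two case lists are made up for this example and are not from the original answer. It has one optional group, one `|` alternation, a `+` quantifier, and the `i` flag, so the case lists try each branch, the empty string, stray whitespace, mixed case, and a control character in the wrong place.

```python
import re

# Hypothetical pattern under test: an optional sign, one or more digits,
# then an optional unit that is either "px" or "em" (case-insensitive).
SIZE = re.compile(r"([+-])?(\d+)(px|em)?", re.IGNORECASE)


def is_size(text):
    """Return True when the whole string matches the pattern."""
    return SIZE.fullmatch(text) is not None


# One entry per path: no sign / each sign, one digit / many digits,
# no unit / each branch of the alternation, and mixed case for the i flag.
ACCEPTED = ["1", "42", "+7", "-7", "10px", "10em", "10PX", "10Em"]

# Empty string, whitespace in different places, a unit with no digits,
# an unknown unit, and a regex control character used as data.
REJECTED = ["", " ", "px", "+", "10pt", "10 px", "10px ", "\t10",
            "10px\n", "1.5em", "10px|em"]


def run_cases():
    """Return every input whose result disagrees with the expectation."""
    failures = [s for s in ACCEPTED if not is_size(s)]
    failures += [s for s in REJECTED if is_size(s)]
    return failures
```

In a real suite each entry would likely be its own parametrized test case, so that a failure names the exact input that broke.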

For a regex that returns lists, also remember:

  • Verify that the data you expect is returned, in the right order, in the right fields.
  • Verify that slight modifications do not return good data.
  • Verify that mixed anonymous groups and named groups parse correctly (eg, (?<name> thing1 ( thing2) )) - this behavior can be different based on the regex engine you're using.
  • Once again, give lots of real world trials.
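A small sketch of the first two checks in Python, assuming a hypothetical date pattern `DATE` and helper `parse_date` (neither comes from the original answer): confirm that each value lands in the expected field, then confirm that slightly modified inputs return nothing.

```python
import re

# Hypothetical pattern: a date written as YYYY-MM-DD, one named group per field.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def parse_date(text):
    """Return the captured fields as a dict, or None if the text is not a date."""
    m = DATE.fullmatch(text)
    return m.groupdict() if m else None


# Right data, right fields.
good = parse_date("2024-03-09")

# Slight modifications of a valid input should not return good data.
near_misses = ["2024-3-09", "2024/03/09", "24-03-2009", "2024-03-09 ", "2024-03-0a"]
near_miss_results = [parse_date(s) for s in near_misses]

# When scanning free text, findall returns one tuple per match, in group order.
found = DATE.findall("from 2024-03-09 until 2025-12-31")
```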

If you use any advanced features, such as non-backtracking groups, make sure you understand completely how the feature works and, using the guidelines above, build example strings that should work for and against each of them.
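To make the 'for and against' idea concrete, here is a hedged Python sketch. It swaps in a negative lookahead as the advanced feature, because Python's `re` module only supports non-backtracking (atomic) groups from version 3.11 onward. The pattern and both lists are invented for illustration.

```python
import re

# Hypothetical rule: find "foo" only when it is not immediately followed by "bar".
FOO_NOT_BAR = re.compile(r"foo(?!bar)")

# Strings built to work FOR the feature. "foobarfoo" is the interesting one:
# the first "foo" is rejected by the lookahead, but the second one matches.
SHOULD_MATCH = ["foo", "food", "foobaz", "foo bar", "foobarfoo"]

# Strings built to work AGAINST it.
SHOULD_NOT_MATCH = ["", "fo", "foobar", "xfoobar", "FOO"]


def check_lookahead():
    """Return every input whose result disagrees with the expectation."""
    wrong = [s for s in SHOULD_MATCH if not FOO_NOT_BAR.search(s)]
    wrong += [s for s in SHOULD_NOT_MATCH if FOO_NOT_BAR.search(s)]
    return wrong
```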

Depending on your regex library implementation, the way groups are captured may be different as well. Perl 5 has an 'open paren order' ordering; C# follows that partially, except for named groups; and so on. Make sure to experiment with your flavor to know exactly what it does.
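One way to run that experiment is to pin the observed behavior down in a test. The sketch below does so for Python's `re` module only, where named and unnamed groups share a single numbering in open-paren order. The pattern is adapted from the mixed-group example earlier in this answer, with a third group added for illustration. Other engines may number the groups differently, so the expected values here should not be carried over to them.

```python
import re

# A named group that contains an unnamed group, followed by another unnamed group.
MIXED = re.compile(r"(?P<name>thing1(thing2))(thing3)")

m = MIXED.fullmatch("thing1thing2thing3")

# In Python's re, groups are numbered by the position of their opening paren,
# and a named group also receives a number.
all_groups = m.groups()
named_only = m.groupdict()
name_to_number = dict(MIXED.groupindex)
```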

Then, integrate them right in with your other unit tests, either in their own module or alongside the module that contains the regex. For particularly nasty regexen, you may find you need lots and lots of tests to verify that the pattern and all the features you use are correct. If the regex makes up a large part (or nearly all) of the work that the method is doing, I will use the advice above to fashion inputs to test that function and not the regex directly. That way, if later you decide that the regex is not the way to go, or you want to break it up, you can capture the behavior the regex provided without changing the interface - ie, the method that invokes the regex.
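A minimal sketch of that approach in Python, assuming a hypothetical function `extract_version` that happens to be implemented with a regex. The tests only call the function, so the private pattern `_VERSION` could later be split up or replaced by hand-written parsing without touching them.

```python
import re

# Implementation detail: callers and tests never reference this directly.
_VERSION = re.compile(r"\bv?(\d+)\.(\d+)(?:\.(\d+))?\b")


def extract_version(text):
    """Return (major, minor, patch) as ints, or None if no version is found.

    A missing patch number is reported as 0.
    """
    m = _VERSION.search(text)
    if m is None:
        return None
    major, minor, patch = m.groups()
    return int(major), int(minor), int(patch or 0)


def test_extract_version():
    # Inputs are chosen with the regex guidelines in mind (optional "v",
    # optional patch group, no digits at all), but only the function is called.
    assert extract_version("release v1.2.3 is out") == (1, 2, 3)
    assert extract_version("version 2.10") == (2, 10, 0)
    assert extract_version("no version here") is None
    assert extract_version("") is None
```

A runner such as pytest would collect `test_extract_version` automatically; it can also be called directly.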

As long as you really know how a regex feature is supposed to work in your flavor of regex, you should be able to develop decent test cases for it. Just make sure you really, really, really do understand how the feature works!
