codingBat separateThousands using regex (and unit testing how-to)

Published: 2021/9/14 19:14:04. Tags: java, regex, unit-testing

Problem description

This question is a combination of regex practice and unit testing practice.

I authored this problem, separateThousands, for personal practice:

Given a number as a string, introduce commas to separate thousands. The number may contain an optional minus sign and an optional decimal part. There will not be any superfluous leading zeroes.

Here's my solution:

```java
String separateThousands(String s) {
  return s.replaceAll(
      String.format("(?:%s)|(?:%s)",
        "(?<=\\G\\d{3})(?=\\d)",
        "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))"
      ),
      ","
  );
}
```

It works by distinguishing two kinds of comma: the first one, and the rest. In the regex above, the rest subpattern actually appears before the first. Every match is zero-length, and replaceAll substitutes "," at each matched position.

The rest subpattern looks behind to see whether the previous match (\G) is followed by exactly 3 digits, and looks ahead to see whether there is another digit. It is a kind of chain reaction triggered by the previous match.

The first subpattern looks behind for the ^ anchor, followed by an optional minus sign and 1 to 3 digits. The remainder of the string from that point must match one or more triplets of digits, followed by a non-digit (which can be either $ or \.).
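To make the two subpatterns easier to follow, here is a minimal sketch that reports the index of every zero-length match instead of replacing it. The class name `CommaPositions` and the helper `commaPositions` are hypothetical names introduced for illustration; the two patterns are copied from the solution above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommaPositions {
    // The two alternatives from the solution, in the same order (rest, then first).
    static final String REST  = "(?<=\\G\\d{3})(?=\\d)";
    static final String FIRST = "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))";
    static final Pattern BOTH =
        Pattern.compile(String.format("(?:%s)|(?:%s)", REST, FIRST));

    // Hypothetical helper: returns the index of every zero-length match,
    // i.e. every position where replaceAll would insert a comma.
    static List<Integer> commaPositions(String s) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = BOTH.matcher(s);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        String[] samples = {"1000", "-1234567890.1234567890", "123.456"};
        for (String s : samples) {
            System.out.println(s + " -> " + commaPositions(s));
        }
    }
}
```

For "-1234567890.1234567890" the reported positions are 2, 5 and 8: the match at 2 comes from the first subpattern, and each later one sits exactly three digits after the previous match, which is the chain reaction described above.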

My questions for this part are:

- Can this regex be simplified?
- Can it be optimized further?
  - Ordering rest before first is deliberate, since first is only needed once
  - No capturing group

As I've mentioned, I'm the author of this problem, so I'm also the one responsible for coming up with test cases for it. Here they are:
```
INPUT, OUTPUT
"1000", "1,000"
"-12345", "-12,345"
"-1234567890.1234567890", "-1,234,567,890.1234567890"
"123.456", "123.456"
".666666", ".666666"
"0", "0"
"123456789", "123,456,789"
"1234.5678", "1,234.5678"
"-55555.55555", "-55,555.55555"
"0.123456789", "0.123456789"
"123456.789", "123,456.789"
```
I haven't had much experience with industrial-strength unit testing, so I'm wondering whether others can comment on whether this is good coverage, whether I've missed anything important, and so on. (I can always add more tests if there's a scenario I've missed.)
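One way to keep such a table checkable is to drive the solution from it directly. The following is a rough sketch in plain Java, without a test framework, that runs the listed INPUT/OUTPUT pairs through the solution. The class name `SeparateThousandsCases`, the helper `countFailures`, and the `EXTRA` boundary cases are illustrative assumptions that do not appear in the original post.

```java
class SeparateThousandsCases {
    // The solution from the question, reproduced unchanged.
    static String separateThousands(String s) {
        return s.replaceAll(
            String.format("(?:%s)|(?:%s)",
                "(?<=\\G\\d{3})(?=\\d)",
                "(?<=^-?\\d{1,3})(?=(?:\\d{3})+(?!\\d))"),
            ",");
    }

    // INPUT/OUTPUT pairs copied from the table above.
    static final String[][] LISTED = {
        {"1000", "1,000"},
        {"-12345", "-12,345"},
        {"-1234567890.1234567890", "-1,234,567,890.1234567890"},
        {"123.456", "123.456"},
        {".666666", ".666666"},
        {"0", "0"},
        {"123456789", "123,456,789"},
        {"1234.5678", "1,234.5678"},
        {"-55555.55555", "-55,555.55555"},
        {"0.123456789", "0.123456789"},
        {"123456.789", "123,456.789"},
    };

    // Hypothetical extra boundary cases (an assumption, not from the post):
    // exactly three digits, with and without a sign, and two commas in a row.
    static final String[][] EXTRA = {
        {"999", "999"},
        {"-999", "-999"},
        {"1000000", "1,000,000"},
        {"-0.5", "-0.5"},
    };

    // Runs every pair, prints each mismatch, and returns the failure count.
    static int countFailures(String[][] cases) {
        int failures = 0;
        for (String[] c : cases) {
            String actual = separateThousands(c[0]);
            if (!actual.equals(c[1])) {
                failures++;
                System.out.println("FAIL " + c[0] + ": expected " + c[1] + ", got " + actual);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = countFailures(LISTED) + countFailures(EXTRA);
        System.out.println(failures == 0 ? "all cases pass" : failures + " case(s) failed");
    }
}
```

Keeping the pairs in an array means a newly discovered scenario becomes a one-line addition, and the same table can be reused against any alternative regex.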
  
Recommended answer

This works for me:
```
return s.replaceAll("(\\G-?\\d{1,3})(?=(?:\\d{3})++(?!\\d))", "$1,");
```
The first time through, \G acts the same as ^, and the lookahead forces \d{1,3} to consume only as many characters as necessary to leave the match position at a three-digit boundary. After that, \d{1,3} consumes the maximum three digits every time, with \G keeping it anchored to the end of the previous match.
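The behaviour described here can be observed by listing what group 1 consumes on each successive match. This is an illustrative sketch only: the class name `AnswerTrace` and the helper `consumedGroups` are hypothetical, while the pattern and the "$1," replacement are taken from the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AnswerTrace {
    // The pattern from the answer, unchanged.
    static final Pattern ANSWER =
        Pattern.compile("(\\G-?\\d{1,3})(?=(?:\\d{3})++(?!\\d))");

    // Same replacement as in the answer: re-emit group 1, then a comma.
    static String separateThousands(String s) {
        return ANSWER.matcher(s).replaceAll("$1,");
    }

    // Hypothetical helper: collects the text consumed by group 1 on each match.
    static List<String> consumedGroups(String s) {
        List<String> groups = new ArrayList<>();
        Matcher m = ANSWER.matcher(s);
        while (m.find()) {
            groups.add(m.group(1));
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] samples = {"1234567", "-1234567890.1234567890", "123.456"};
        for (String s : samples) {
            System.out.println(s + " consumes " + consumedGroups(s)
                + " -> " + separateThousands(s));
        }
    }
}
```

For "1234567" group 1 consumes "1" and then "234": the first match is shortened to a single digit so that whole triplets remain, and the second match takes the full three digits. The final "567" gets no comma because the lookahead finds no further triplet after it.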
  
  As for your unit tests, I would just make it clear in the problem description that the input will always be a valid number, with at most one decimal point.
  