Split string into repeated characters
Question
I want to split the string "aaaabbbccccaaddddcfggghhhh" into "aaaa", "bbb", "cccc", "aa", "dddd", "c", "f" and so on.
I tried:
String[] arr = "aaaabbbccccaaddddcfggghhhh".split("(.)(?!\\1)");
But this consumes the last character of each group, so with the regular expression above I get "aaa" as the first string, while I want "aaaa".
How can I achieve this?
Recommended answer
Try this:
String str = "aaaabbbccccaaddddcfggghhhh";
String[] out = str.split("(?<=(.))(?!\\1)");
System.out.println(Arrays.toString(out));
=> [aaaa, bbb, cccc, aa, dddd, c, f, ggg, hhhh]
Explanation: we want to split the string into groups of identical characters, so we need to find the "boundary" between each pair of adjacent groups. I'm using Java's syntax for a positive look-behind, (?<=(.)), to capture the previous character in group 1, and then a negative look-ahead with a back reference, (?!\\1), to verify that the next character is not the same as the previous one. No characters are actually consumed, because the pattern consists only of two look-around assertions (that is, the regular expression is zero-width), so split() cuts at the boundary without removing anything.
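To make the zero-width behaviour easier to check, here is a small self-contained sketch. The class and method names (RunSplitter, splitRuns, findRuns) are made up for illustration and are not part of the original answer. splitRuns uses the look-around pattern from the answer; findRuns is a different technique that I'm adding for comparison, matching each run directly with (.)\\1* instead of splitting on boundaries. Both assume the input contains no line terminators, since . does not match them by default.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RunSplitter {
    // Zero-width boundary from the answer: the look-behind captures the
    // previous char in group 1, the look-ahead rejects positions where
    // that same char comes next.
    private static final Pattern BOUNDARY = Pattern.compile("(?<=(.))(?!\\1)");

    // Alternative technique (not from the answer): match a whole run,
    // i.e. one char followed by zero or more copies of itself.
    private static final Pattern RUN = Pattern.compile("(.)\\1*");

    public static String[] splitRuns(String s) {
        return BOUNDARY.split(s);
    }

    public static List<String> findRuns(String s) {
        List<String> runs = new ArrayList<>();
        Matcher m = RUN.matcher(s);
        while (m.find()) {
            runs.add(m.group()); // the full matched run, e.g. "aaaa"
        }
        return runs;
    }

    public static void main(String[] args) {
        String str = "aaaabbbccccaaddddcfggghhhh";
        System.out.println(Arrays.toString(splitRuns(str)));
        System.out.println(findRuns(str));
        // The pattern from the question, for comparison: it matches (and
        // therefore removes) the last char of every run.
        System.out.println(Arrays.toString(str.split("(.)(?!\\1)")));
    }
}
```

One difference worth knowing: for an empty input, split returns an array holding a single empty string, while the Matcher loop returns an empty list, so the second form may be easier to work with if empty input is possible.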