Split regex to extract Strings of contiguous characters


Problem description

Is there a regex that works with String.split() to break a String into runs of contiguous identical characters, i.e. one that splits wherever the next character differs from the previous character?

Here is the test case:

    String regex = "your answer here";
    String[] parts = "aaabbcddeee".split(regex);
    System.out.println(Arrays.toString(parts));

Expected output:

[aaa, bb, c, dd, eee]

Although the test case uses only letters as input, this is for clarity only; the input may contain any characters.

Please do not provide "work-arounds" involving loops or other techniques.

The question is to find the right regex for the code shown above, i.e. using only split() and no other method calls. It is not a question about finding code that will "do the job".

Recommended answer

It is entirely possible to write the regex for splitting in one step:

"(?<=(.))(?!\\1)"

Since you want to split between every group of identical characters, we only need to find the boundary between two groups. I achieve this by using a positive look-behind to capture the previous character, and a negative look-ahead with a back-reference to check that the next character is not that same character.
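As a minimal sketch of how this regex behaves on the test case from the question, the class below wraps the split() call. The names `ContiguousSplit`, `splitRuns`, `BOUNDARY` and `BOUNDARY_DOTALL` are hypothetical and chosen only for illustration. The `(?s)` variant is an assumption added here, not part of the original answer: in Java, `.` does not match line terminators by default, so input containing newlines may need the DOTALL flag for the look-behind to capture them.

```java
import java.util.Arrays;

class ContiguousSplit {
    // The regex from the answer: a zero-width position whose previous character
    // (captured as group 1) is not repeated as the next character.
    static final String BOUNDARY = "(?<=(.))(?!\\1)";

    // Assumption, not part of the original answer: (?s) enables DOTALL so that
    // '.' also matches line terminators such as '\n'.
    static final String BOUNDARY_DOTALL = "(?s)" + BOUNDARY;

    // Hypothetical helper: a thin wrapper around String.split().
    static String[] splitRuns(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // Test case from the question; prints [aaa, bb, c, dd, eee]
        System.out.println(Arrays.toString(splitRuns("aaabbcddeee", BOUNDARY)));

        // Digits, spaces and punctuation are split into runs the same way.
        System.out.println(Arrays.toString(splitRuns("112  !!3", BOUNDARY)));
    }
}
```

The wrapper exists only to make the sketch easy to run; the question's own snippet can use the regex string directly in split().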

As you can see, the regex is zero-width (it consists of only two look-around assertions), so it consumes no characters.
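To make the zero-width property concrete, the sketch below lists where the regex matches and checks that each match is empty. It uses `Pattern` and `Matcher` purely for inspection; it is not an alternative to the split()-only solution. The class name `ZeroWidthCheck` and the method `boundaries` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ZeroWidthCheck {
    // Returns the index of every position where the boundary regex matches.
    static List<Integer> boundaries(String input) {
        Matcher m = Pattern.compile("(?<=(.))(?!\\1)").matcher(input);
        List<Integer> positions = new ArrayList<>();
        while (m.find()) {
            // A zero-width match starts and ends at the same index.
            if (m.start() != m.end()) {
                throw new IllegalStateException("match consumed characters");
            }
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // Prints [3, 5, 6, 8, 11]
        System.out.println(boundaries("aaabbcddeee"));
    }
}
```

The last position, 11, is the end of the input, where the negative look-ahead succeeds because no character follows. In split() that match produces a trailing empty string, which split() discards by default, so the expected output is unaffected.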
