Split string based on regex but keep delimiters


Question

I'm trying to split a string using a variety of characters as delimiters, while also keeping each delimiter as its own array element. For example, say I want to split the string:


if (x>1) return x * fact(x-1);

using '(', '>', ')', '*', '-', ';' and '\s' as delimiters. I want the output to be the following string array: {"if", "(", "x", ">", "1", ")", "return", "x", "*", "fact", "(", "x", "-", "1", ")", ";"}

The regex I'm using so far is

split("(?=(\\w+(?=[\\s\\+\\-\\*/<(<=)>(>=)(==)(!=)=;,\\.\"\\(\\)\\[\\]\\{\\}])))")

which splits at each word character regardless of whether it is followed by one of the delimiters. For example


test + 1

outputs {"t","e","s","t+","1"} instead of {"test+", "1"}

Why does it split at each character even if that character is not followed by one of my delimiters? Also, is a regex that does this even possible in Java? Thank you.

Recommended answer

Well, you can use lookaround to split at the points between characters without consuming the delimiters:

(?<=[()>*;\s-])|(?=[()>*;\s-])

This creates a split point before and after each delimiter character. The hyphen comes last in the character class so that it is a literal rather than the start of a range. You might need to remove superfluous whitespace elements from the resulting array, though.

Quick PowerShell test (| marks the split points):

PS Home:\> 'if (x>1) return x * fact(x-1);' -split '(?<=[()>*;\s-])|(?=[()>*;\s-])' -join '|'
if| |(|x|>|1|)| |return| |x| |*| |fact|(|x|-|1|)|;|
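Since the question asks about Java, here is a minimal sketch of the same lookaround split there, with the whitespace-only elements filtered out afterwards. The class name SplitKeepDelimiters and the method tokenize are made up for illustration, and the delimiter set is just the one listed in the question. The hyphen is placed last in the character class so that it is read as a literal rather than as a range.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is only for illustration.
class SplitKeepDelimiters {

    // Zero-width split points directly after or directly before a delimiter.
    // The hyphen is last in the class, so it is a literal and not a range.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=[()>*;\\s-])|(?=[()>*;\\s-])");

    // Splits the input at every boundary and drops whitespace-only pieces.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        for (String piece : BOUNDARY.split(input)) {
            if (!piece.trim().isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [if, (, x, >, 1, ), return, x, *, fact, (, x, -, 1, ), ;]
        System.out.println(tokenize("if (x>1) return x * fact(x-1);"));
    }
}
```

Because the lookarounds match without consuming any characters, every delimiter survives as its own element, and multi-character words and numbers such as "fact" or "10" stay in one piece.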
