How do I split a string by commas, except inside parentheses, using a regular expression?


Question

I want to split a string by commas:

"a,s".split ','  # => ['a', 's']

I don't want to split a substring if it is wrapped in parentheses:

"a,s(d,f),g,h"

should produce:

['a', 's(d,f)', 'g', 'h']

Any suggestions?

Recommended answer

To deal with nested parentheses, you can use:

txt = "a,s(d,f(4,5)),g,h"
pattern = Regexp.new('((?:[^,(]+|(\((?>[^()]+|\g<-1>)*\)))+)')
puts txt.scan(pattern).map &:first

Pattern details:

(                        # first capturing group
    (?:                  # open a non capturing group
        [^,(]+           # all characters except , and (
      |                  # or
        (                # open the second capturing group
           \(            # (
            (?>          # open an atomic group
                [^()]+   # all characters except parenthesis
              |          # OR
                \g<-1>   # the last capturing group (you can also write \g<2>)
            )*           # close the atomic group
            \)           # )
        )                # close the second capturing group
    )+                   # close the non-capturing group and repeat it
)                        # close the first capturing group

The second capturing group describes nested parentheses: it can contain characters that are not parentheses, or the capturing group itself. It is a recursive pattern.
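
To see the recursion in isolation, the sketch below pulls the parenthesis-matching group out into its own pattern. The constant name BALANCED is made up for this illustration, and because the group stands alone here it is group 1, so it calls itself with \g<1>.

```ruby
# Group 1 matches one balanced parenthesized run.
# \g<1> re-enters group 1, which is what lets it handle nesting.
BALANCED = /(\((?>[^()]+|\g<1>)*\))/

p "s(d,f(4,5)),g"[BALANCED]  # => "(d,f(4,5))"
p "a(b"[BALANCED]            # => nil (the parenthesis is never closed)
```

An unclosed parenthesis makes the group fail as a whole, which is why the full pattern stops at an unbalanced "(".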

Inside the pattern, you can refer to a capture group by its number (\g<2> for the second capturing group), by its relative position (\g<-1>, the first group to the left of the current position in the pattern), or by its name if you use named capturing groups.
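
As an illustration of the named form, here is a sketch of the same pattern with named groups and the x (extended) flag. The names item, paren, NAMED_PATTERN and split_items are invented for this example. Both groups are named because Ruby stops capturing unnamed groups as soon as a pattern contains a named one.

```ruby
NAMED_PATTERN = /
  (?<item>                 # one item between commas
    (?:
        [^,(]+             # text outside parentheses
      |
        (?<paren>          # a balanced parenthesized run
          \( (?> [^()]+ | \g<paren> )* \)
        )
    )+
  )
/x

# scan returns [item, paren] for each match; keep only the item.
def split_items(text)
  text.scan(NAMED_PATTERN).map(&:first)
end

p split_items("a,s(d,f(4,5)),g,h")  # => ["a", "s(d,f(4,5))", "g", "h"]
```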

Note: you can allow unbalanced (single) parentheses by adding |[()] before the end of the non-capturing group. Then a,b(,c gives ['a', 'b(', 'c'].
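
A minimal sketch of that variant follows; LENIENT_PATTERN and split_lenient are names chosen for this example only. The extra |[()] branch is tried last, so it only consumes a parenthesis that the balanced-group branch could not match.

```ruby
# Same pattern as above, plus a final |[()] alternative for stray parentheses.
LENIENT_PATTERN = /((?:[^,(]+|(\((?>[^()]+|\g<2>)*\))|[()])+)/

def split_lenient(text)
  text.scan(LENIENT_PATTERN).map(&:first)
end

p split_lenient("a,b(,c")             # => ["a", "b(", "c"]
p split_lenient("a,s(d,f(4,5)),g,h")  # => ["a", "s(d,f(4,5))", "g", "h"]
```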
