Parsing a string containing an array


Question

I'd like to convert a string containing a recursively nested array of strings into an array of depth one.

Example:

StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") == ["a", "b", "[c, [d, e]]", "f", "[g, h]", "i"]

It seems quite simple. But I come from a functional background and I'm not that familiar with the .NET Framework standard libraries, so every time (I have started from scratch about three times) I end up with just plain ugly code. My latest implementation is here. As you can see, it's ugly as hell.

So, what's the C# way to do this?

Recommended answer

@ojlovecd has a good answer, using regular expressions.
However, his answer is overly complicated, so here's my similar, simpler answer.

// Requires: using System.Linq; using System.Text.RegularExpressions;
public string[] StringToArray(string input) {
    var pattern = new Regex(@"
        \[
            (?:
            \s*
                (?<results>(?:
                (?(open)  [^\[\]]+  |  [^\[\],]+  )
                |(?<open>\[)
                |(?<-open>\])
                )+)
                (?(open)(?!))
            ,?
            )*
        \]
    ", RegexOptions.IgnorePatternWhitespace);

    // Find the first match:
    var result = pattern.Match(input);
    if (result.Success) {
        // Extract the captured values:
        var captures = result.Groups["results"].Captures.Cast<Capture>().Select(c => c.Value).ToArray();
        return captures;
    }
    // Not a match
    return null;
}

Using this code, you will see that StringToArray("[a, b, [c, [d, e]], f, [g, h], i]") returns the following array: ["a", "b", "[c, [d, e]]", "f", "[g, h]", "i"].
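For comparison, the same top-level split can be sketched without regular expressions by scanning the string once and tracking bracket depth by hand. This is not the answer's method, only an illustrative alternative; the names TopLevelSplitter and Split are hypothetical, and the sketch assumes items are separated by commas and that surrounding whitespace should be trimmed.

```csharp
using System;
using System.Collections.Generic;

public static class TopLevelSplitter
{
    // Splits "[a, [b, c], d]" into its top-level items: "a", "[b, c]", "d".
    // Returns null when the input is not wrapped in brackets or the inner
    // brackets are unbalanced.
    public static string[] Split(string input)
    {
        if (input == null) return null;
        string s = input.Trim();
        if (s.Length < 2 || s[0] != '[' || s[s.Length - 1] != ']') return null;

        var items = new List<string>();
        int depth = 0;   // nesting depth inside the outer brackets
        int start = 1;   // index where the current item begins

        for (int i = 1; i < s.Length - 1; i++)
        {
            char c = s[i];
            if (c == '[')
            {
                depth++;
            }
            else if (c == ']')
            {
                if (--depth < 0) return null; // closing bracket without an opener
            }
            else if (c == ',' && depth == 0)
            {
                // A comma at depth 0 ends the current top-level item.
                items.Add(s.Substring(start, i - start).Trim());
                start = i + 1;
            }
        }
        if (depth != 0) return null; // an inner bracket was never closed

        string last = s.Substring(start, s.Length - 1 - start).Trim();
        if (last.Length > 0 || items.Count > 0) items.Add(last);
        return items.ToArray();
    }
}
```

Calling TopLevelSplitter.Split("[a, b, [c, [d, e]], f, [g, h], i]") yields the same six items as the regex version. The trade-off is more lines of code in exchange for logic that can be stepped through in a debugger.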

For more information on the balancing groups that I used for matching balanced brackets, take a look at Microsoft's documentation.
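To isolate the balancing-group idiom from the rest of the pattern, here is a minimal sketch that only checks whether the brackets in a string are balanced. The class and method names (BracketBalance, IsBalanced) are hypothetical; the constructs themselves, (?<open>...), (?<-open>...) and (?(open)(?!)), are the same .NET features the answer's pattern relies on.

```csharp
using System.Text.RegularExpressions;

public static class BracketBalance
{
    private static readonly Regex Balanced = new Regex(@"
        ^
        (?:
            [^\[\]]          # any non-bracket character
            | (?<open>\[)    # push a capture onto the 'open' stack
            | (?<-open>\])   # pop one capture; fails if the stack is empty
        )*
        (?(open)(?!))        # fail if any '[' is still unmatched
        $
    ", RegexOptions.IgnorePatternWhitespace);

    // True when every '[' has a matching ']' and no ']' comes too early.
    public static bool IsBalanced(string input)
    {
        return Balanced.IsMatch(input);
    }
}
```

With this sketch, IsBalanced("[c, [d, e]]") is true, while IsBalanced("[c, [d, e]") and IsBalanced("c]") are both false.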

Update:
As per the comments, if you also want to balance quotes, here's a possible modification. (Note that inside a C# verbatim string literal, " is escaped as "".) I also added descriptions of the pattern to help clarify it:

    var pattern = new Regex(@"
        \[
            (?:
            \s*
                (?<results>(?:              # Capture everything into 'results'
                    (?(open)                # If 'open' Then
                        [^\[\]]+            #   Capture everything but brackets
                        |                   # Else (not open):
                        (?:                 #   Capture either:
                            [^\[\],'""]+    #       Unimportant characters
                            |               #   Or
                            ['""][^'""]*?['""] #    Anything between quotes
                        )  
                    )                       # End If
                    |(?<open>\[)            # Open bracket
                    |(?<-open>\])           # Close bracket
                )+)
                (?(open)(?!))               # Fail while there's an unbalanced 'open'
            ,?
            )*
        \]
    ", RegexOptions.IgnorePatternWhitespace);

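The quote escaping used in the update can be checked in isolation. The sketch below, with the hypothetical names QuoteEscapeDemo and FirstQuoted, writes the quoted-text sub-pattern once as a verbatim string (where "" stands for one double quote) and once as a regular string literal (where \" is used), then applies it to find the first quoted run in a string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteEscapeDemo
{
    // Verbatim string: "" is one literal double-quote character.
    public const string Verbatim = @"['""][^'""]*?['""]";

    // The same pattern as a regular string literal, using \" instead.
    public const string Regular = "['\"][^'\"]*?['\"]";

    // Returns the first quoted run (quotes included), or null if there is none.
    public static string FirstQuoted(string input)
    {
        Match m = Regex.Match(input, Verbatim);
        return m.Success ? m.Value : null;
    }
}
```

Both constants hold the identical pattern text, so QuoteEscapeDemo.Verbatim == QuoteEscapeDemo.Regular is true. Note that the sub-pattern accepts any quote character at either end, so it does not require the opening and closing quotes to be of the same kind.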