Splitting a comma-separated string, ignoring commas in quotes, but allowing strings with one double quotation mark

Problem description

I have searched through several posts on Stack Overflow on how to split a string on a comma delimiter while ignoring commas inside quotes (see: How do I split a string into an array by comma but ignore commas inside double quotes?). I am trying to achieve similar results, but I also need to allow for a field that contains a single, unpaired double quote.

That is, I need "test05, \"test, 05\", test\", test 05" to split into:

  • test05
  • "test, 05"
  • test"
  • test 05

I tried a method similar to the one mentioned here:

Regex for splitting a string using space when not surrounded by single or double quotes

It uses Matcher instead of split(). However, that specific example splits on spaces, not on commas. I have tried to adjust the pattern to account for commas instead, but have not had any luck.

String str = "test05, \"test, 05\", test\", test 05";
str = str + " "; // add trailing space
int len = str.length();
Matcher m = Pattern.compile("((\"[^\"]+?\")|([^,]+?)),++").matcher(str);

for (int i = 0; i < len; i++)
{
    m.region(i, len);

    if (m.lookingAt())
    {
        String s = m.group(1);

        if ((s.startsWith("\"") && s.endsWith("\"")))
        {
            s = s.substring(1, s.length() - 1);
        }

        System.out.println(i + ": \"" + s + "\"");
        i += (m.group(0).length() - 1);
    }
}

Recommended answer

I've had similar issues with this, and I found no good .NET solution, so I went DIY.

In my application I'm parsing a CSV, so my delimiter is ",". I suppose this method only works where the delimiter is a single character.

So I've written a function that ignores commas within double quotes. It does this by walking the input string one character at a time and tracking whether the current position is inside a pair of double quotes.

// Requires: using System.Collections.Generic; using System.Text;
public static string[] Splitter_IgnoreQuotes(string stringToSplit)
{
    // Collect fields in a list so no array size has to be guessed up front.
    var fields = new List<string>();
    var current = new StringBuilder();
    bool insideQuotes = false;

    foreach (char theChar in stringToSplit)
    {
        if (theChar == '"')
        {
            // A double quote toggles quoted mode and is dropped from the output.
            insideQuotes = !insideQuotes;
        }
        else if (theChar == ',' && !insideQuotes)
        {
            // An unquoted comma ends the current field.
            // ',' could be made a parameter; I'm working with a CSV.
            fields.Add(current.ToString());
            current.Clear();
        }
        else
        {
            // Any other character, including a comma inside quotes, belongs to the field.
            current.Append(theChar);
        }
    }

    // The last field has no trailing comma, so add it after the loop.
    fields.Add(current.ToString());
    return fields.ToArray();
}

To suit my application, this function also drops every double quote character (") from the input, since the quotes are present in my data but not needed in the output.
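Note that a function which toggles on every double quote treats the lone quote in the question's input as the start of a quoted region, so the comma after test" would not split. Since the question is in Java, here is a minimal Java sketch of one way to handle that case. It rests on an assumption that is not stated in the thread: a double quote only opens a quoted field when it is the first non-blank character of the field and its closing quote is followed (after optional blanks) by a comma or the end of the input. Any other quote is kept as a literal character. The class and method names (QuoteAwareSplitter, split) are made up for illustration, and escaped quotes inside a quoted field are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteAwareSplitter {

    // Splits on commas, except commas inside a field that is fully wrapped in
    // double quotes. Unpaired quotes are kept as ordinary characters.
    static List<String> split(String input) {
        List<String> fields = new ArrayList<>();
        int len = input.length();
        int pos = 0;

        while (pos <= len) {
            // Skip blanks in front of the field.
            int start = pos;
            while (start < len && Character.isWhitespace(input.charAt(start))) {
                start++;
            }

            String field = null;
            int end = len; // index of the comma that ends this field, or len

            // Assumption: a field counts as quoted only if it starts with '"'
            // and the next '"' is followed by blanks and then ',' or end of input.
            if (start < len && input.charAt(start) == '"') {
                int close = input.indexOf('"', start + 1);
                if (close >= 0) {
                    int after = close + 1;
                    while (after < len && Character.isWhitespace(input.charAt(after))) {
                        after++;
                    }
                    if (after == len || input.charAt(after) == ',') {
                        field = input.substring(start + 1, close); // strip the quotes
                        end = after;
                    }
                }
            }

            // Otherwise split at the next comma and keep any quote literally.
            if (field == null) {
                int comma = input.indexOf(',', start);
                end = (comma < 0) ? len : comma;
                field = input.substring(start, end).trim();
            }

            fields.add(field);
            pos = end + 1; // continue after the comma
        }
        return fields;
    }

    public static void main(String[] args) {
        String str = "test05, \"test, 05\", test\", test 05";
        for (String s : split(str)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

For the question's input this prints [test05], [test, 05], [test"] and [test 05], each on its own line. The surrounding quotes of the quoted field are stripped, as in the question's own code; if they should be preserved, take substring(start, close + 1) instead.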
