How can I Split(',') a string while ignoring commas in between quotes?


Problem description

I am using the .Split(',') method on a string that I know has values delimited by commas, and I want those values to be separated and put into a string[] object. This works great for strings like this:

78,969.82,GW440,.

But the values start to look different when the second value goes over 1000, as in this example:

79,"1,013.42",GW450,....

These values come from a spreadsheet control, which I export using the control's built-in ExportToCsv(...) method. That explains why I get a formatted version of the actual numerical value.

Is there a way I can get the .Split(',') method to ignore commas inside of quotes? I don't want the value "1,013.42" to be split into `"1` and `013.42"`.

Any ideas? Thanks!

I would really like to do this without incorporating a third-party tool, as my use case doesn't involve many cases besides this one. Even though it is part of my work's solution, incorporating such a tool doesn't really benefit anyone at the moment. I was hoping there was something quick that I was missing to solve this particular use case, but now that it is the weekend, I'll see if I can give one more update to this question on Monday with the solution I eventually come up with. Thank you everyone for your assistance so far; I will assess each answer further on Monday.

Recommended answer

This is a fairly straightforward CSV reader implementation we use in a few projects here. It is easy to use and handles the cases you are talking about.

First, the Csv class:

public static class Csv
{
    public static string Escape(string s)
    {
        if (s.Contains(QUOTE))
            s = s.Replace(QUOTE, ESCAPED_QUOTE);

        if (s.IndexOfAny(CHARACTERS_THAT_MUST_BE_QUOTED) > -1)
            s = QUOTE + s + QUOTE;

        return s;
    }

    public static string Unescape(string s)
    {
        // Length check: a lone '"' starts and ends with a quote but cannot be unwrapped.
        if (s.Length >= 2 && s.StartsWith(QUOTE) && s.EndsWith(QUOTE))
        {
            s = s.Substring(1, s.Length - 2);

            if (s.Contains(ESCAPED_QUOTE))
                s = s.Replace(ESCAPED_QUOTE, QUOTE);
        }

        return s;
    }


    private const string QUOTE = "\"";
    private const string ESCAPED_QUOTE = "\"\"";
    private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };

}
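As a quick check of how the two helpers behave, here is a minimal sketch of an escape/unescape round trip. To keep it runnable on its own, it uses local copies of the logic (the local function names `Escape` and `Unescape` are stand-ins for `Csv.Escape` and `Csv.Unescape`) and assumes C# 9 top-level statements.

```csharp
using System;

// Local copy of the Csv.Escape logic: double any quotes, then wrap the value
// in quotes if it contains a comma, a quote, or a newline.
string Escape(string s)
{
    s = s.Replace("\"", "\"\"");
    if (s.IndexOfAny(new[] { ',', '"', '\n' }) > -1)
        s = "\"" + s + "\"";
    return s;
}

// Local copy of the Csv.Unescape logic: strip one pair of surrounding quotes
// and collapse doubled quotes back to single ones.
string Unescape(string s)
{
    if (s.Length >= 2 && s.StartsWith("\"") && s.EndsWith("\""))
        s = s.Substring(1, s.Length - 2).Replace("\"\"", "\"");
    return s;
}

string plain = Escape("GW450");           // no special characters, left as-is
string number = Escape("1,013.42");       // contains a comma, so it gets quoted
string quoted = Escape("say \"hi\"");     // quotes are doubled, then wrapped

Console.WriteLine(plain);                 // GW450
Console.WriteLine(number);                // "1,013.42"
Console.WriteLine(quoted);                // "say ""hi"""
Console.WriteLine(Unescape(number));      // 1,013.42
Console.WriteLine(Unescape(quoted));      // say "hi"
```

Note that these helpers only handle a single field; splitting a whole line into fields is what the regex in the reader does.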

Then a pretty nice reader implementation, if you need it. You should be able to do what you need with just the Csv class above.

using System.IO;
using System.Text.RegularExpressions;

public sealed class CsvReader : System.IDisposable
{
    public CsvReader(string fileName)
        : this(new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
    }

    public CsvReader(Stream stream)
    {
        __reader = new StreamReader(stream);
    }

    public System.Collections.IEnumerable RowEnumerator
    {
        get
        {
            if (null == __reader)
                throw new System.ApplicationException("I can't start reading without CSV input.");

            __rowno = 0;
            string sLine;
            string sNextLine;

            while (null != (sLine = __reader.ReadLine()))
            {
                while (rexRunOnLine.IsMatch(sLine) && null != (sNextLine = __reader.ReadLine()))
                    sLine += "\n" + sNextLine;

                __rowno++;
                string[] values = rexCsvSplitter.Split(sLine);

                for (int i = 0; i < values.Length; i++)
                    values[i] = Csv.Unescape(values[i]);

                yield return values;
            }

            __reader.Close();
        }

    }

    public long RowIndex { get { return __rowno; } }

    public void Dispose()
    {
        if (null != __reader) __reader.Dispose();
    }

    //============================================


    private long __rowno = 0;
    private TextReader __reader;
    private static Regex rexCsvSplitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");
    private static Regex rexRunOnLine = new Regex(@"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$");

}

Then you can use it like this:

var reader = new CsvReader(new FileStream(file, FileMode.Open));

Note: This opens an existing CSV file, but the reader can be modified fairly easily to take a string[] like you need.
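If the input is already a string in memory, as in the question, one possible shortcut is to apply the same splitter pattern to a single line instead of going through a file. The sketch below is an illustration rather than part of the original answer: `SplitCsvLine` and `Unquote` are hypothetical helper names, the pattern is copied from `rexCsvSplitter`, and it assumes C# 9 top-level statements and fields that contain no embedded newlines.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: strips one pair of surrounding quotes and collapses
// doubled quotes, mirroring what Csv.Unescape does.
string Unquote(string s)
{
    if (s.Length >= 2 && s.StartsWith("\"") && s.EndsWith("\""))
        return s.Substring(1, s.Length - 2).Replace("\"\"", "\"");
    return s;
}

// Hypothetical helper: splits one CSV line on commas that are followed by an
// even number of double quotes, i.e. commas that sit outside quoted fields.
string[] SplitCsvLine(string line)
{
    var splitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");
    string[] values = splitter.Split(line);
    for (int i = 0; i < values.Length; i++)
        values[i] = Unquote(values[i]);
    return values;
}

string[] fields = SplitCsvLine("79,\"1,013.42\",GW450");
Console.WriteLine(string.Join(" | ", fields)); // 79 | 1,013.42 | GW450
```

The lookahead rescans the rest of the line for every comma, so this is fine for short export rows but may be slow on very long lines.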
