How can I Split(',') a string while ignoring commas in between quotes?
Question
I am using the .Split(',') method on a string that I know has values delimited by commas, and I want those values to be separated and put into a string[] object. This works great for strings like this:
78,969.82,GW440,
But the values start to look different when that second value goes over 1000, like the one found in this example:
79,"1,013.42",GW450,...
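To make the failure concrete, here is a minimal sketch of what a plain Split(',') does with the second example. The class name NaiveSplitDemo and its helper are hypothetical and exist only for illustration; the trailing "..." from the example is left out of the input.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Splits on every comma, including the one inside the quoted value.
    public static string[] NaiveSplit(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        string[] parts = NaiveSplit("79,\"1,013.42\",GW450");

        // Four pieces come back instead of three, because the quoted
        // value is cut at its thousands separator.
        Console.WriteLine(parts.Length);
        foreach (string part in parts)
            Console.WriteLine(part);
    }
}
```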
These values come from a spreadsheet control, where I use the control's built-in ExportToCsv(...) method, which explains why I get a formatted version of the actual numerical value.
Is there a way I can get the .Split(',') method to ignore commas inside of quotes? I don't actually want the value "1,013.42" to be split up as "1 and 013.42".
Any ideas? Thanks!
I would really like to do this without incorporating a third-party tool: my use case doesn't involve many cases besides this one, and even though this is part of my work's solution, incorporating a tool like that doesn't really benefit anyone at the moment. I was hoping there was something quick that I was missing to solve this particular case. Since it is now the weekend, I'll try to update this question on Monday with the solution I eventually come up with. Thank you everyone for your assistance so far; I will assess each answer further on Monday.
Recommended answer
This is a fairly straightforward CSV reader implementation that we use in a few projects here. It is easy to use and handles the cases you are talking about.
First, the Csv class:
public static class Csv
{
    // Doubles any embedded quotes, then wraps the value in quotes if it
    // contains a comma, a quote, or a newline.
    public static string Escape(string s)
    {
        if (s.Contains(QUOTE))
            s = s.Replace(QUOTE, ESCAPED_QUOTE);
        if (s.IndexOfAny(CHARACTERS_THAT_MUST_BE_QUOTED) > -1)
            s = QUOTE + s + QUOTE;
        return s;
    }

    // Strips the surrounding quotes (if present) and collapses doubled quotes.
    public static string Unescape(string s)
    {
        // The length check keeps a lone quote character from reaching Substring.
        if (s.Length >= 2 && s.StartsWith(QUOTE) && s.EndsWith(QUOTE))
        {
            s = s.Substring(1, s.Length - 2);
            if (s.Contains(ESCAPED_QUOTE))
                s = s.Replace(ESCAPED_QUOTE, QUOTE);
        }
        return s;
    }

    private const string QUOTE = "\"";
    private const string ESCAPED_QUOTE = "\"\"";
    private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };
}
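As a quick illustration of what Escape and Unescape do, the sketch below round-trips two values. To compile on its own it repeats the same logic under a hypothetical class name, CsvRoundTripDemo; it is not a separate implementation.

```csharp
using System;

public static class CsvRoundTripDemo
{
    private const string Quote = "\"";
    private const string EscapedQuote = "\"\"";
    private static readonly char[] MustBeQuoted = { ',', '"', '\n' };

    // Same logic as Csv.Escape, repeated so this sketch is self-contained.
    public static string Escape(string s)
    {
        if (s.Contains(Quote))
            s = s.Replace(Quote, EscapedQuote);
        if (s.IndexOfAny(MustBeQuoted) > -1)
            s = Quote + s + Quote;
        return s;
    }

    // Same logic as Csv.Unescape, repeated so this sketch is self-contained.
    public static string Unescape(string s)
    {
        if (s.Length >= 2 && s.StartsWith(Quote) && s.EndsWith(Quote))
            s = s.Substring(1, s.Length - 2).Replace(EscapedQuote, Quote);
        return s;
    }

    public static void Main()
    {
        // A value with a comma gets wrapped in quotes.
        string money = Escape("1,013.42");
        Console.WriteLine(money);
        Console.WriteLine(Unescape(money));

        // A value with quotes gets them doubled and is then wrapped.
        string quoted = Escape("say \"hi\"");
        Console.WriteLine(quoted);
        Console.WriteLine(Unescape(quoted));
    }
}
```

Values without a comma, quote, or newline pass through both methods unchanged.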
Then a pretty nice reader implementation, if you need it. You should be able to do what you need with just the Csv class above.
using System.IO;
using System.Text.RegularExpressions;

public sealed class CsvReader : System.IDisposable
{
    public CsvReader(string fileName)
        : this(new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
    }

    public CsvReader(Stream stream)
    {
        __reader = new StreamReader(stream);
    }

    // Yields one string[] per CSV row, with each field already unescaped.
    public System.Collections.IEnumerable RowEnumerator
    {
        get
        {
            if (null == __reader)
                throw new System.ApplicationException("I can't start reading without CSV input.");
            __rowno = 0;
            string sLine;
            string sNextLine;
            while (null != (sLine = __reader.ReadLine()))
            {
                // An odd number of quotes means a quoted field continues on the next line.
                while (rexRunOnLine.IsMatch(sLine) && null != (sNextLine = __reader.ReadLine()))
                    sLine += "\n" + sNextLine;
                __rowno++;
                string[] values = rexCsvSplitter.Split(sLine);
                for (int i = 0; i < values.Length; i++)
                    values[i] = Csv.Unescape(values[i]);
                yield return values;
            }
            __reader.Close();
        }
    }

    public long RowIndex { get { return __rowno; } }

    public void Dispose()
    {
        if (null != __reader) __reader.Dispose();
    }

    //============================================
    private long __rowno = 0;
    private TextReader __reader;
    // Matches only commas that are followed by an even number of quotes, i.e. commas outside quoted fields.
    private static Regex rexCsvSplitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");
    // Matches a line that contains an odd number of quotes (an unterminated quoted field).
    private static Regex rexRunOnLine = new Regex(@"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$");
}
Then you can use it like this:
var reader = new CsvReader(new FileStream(file, FileMode.Open));
Note: This opens an existing CSV file, but it can be modified fairly easily to take a string[] like you need.
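For the original use case, where the input is a line already held in a string rather than a file, one possible adaptation is sketched below. It reuses the splitter regex from CsvReader and the unescape logic from Csv; the class name CsvLineSplitter and the method SplitLine are hypothetical names chosen for this example, and the sketch assumes each input string is one complete CSV row.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvLineSplitter
{
    // Same pattern as rexCsvSplitter: a comma followed by an even number
    // of quotes, which means the comma sits outside any quoted field.
    private static readonly Regex Splitter =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");

    // Splits one complete CSV row into its unescaped field values.
    public static string[] SplitLine(string line)
    {
        string[] values = Splitter.Split(line);
        for (int i = 0; i < values.Length; i++)
            values[i] = Unquote(values[i]);
        return values;
    }

    // Same logic as Csv.Unescape, repeated so this sketch is self-contained.
    private static string Unquote(string s)
    {
        if (s.Length >= 2 && s.StartsWith("\"") && s.EndsWith("\""))
            s = s.Substring(1, s.Length - 2).Replace("\"\"", "\"");
        return s;
    }

    public static void Main()
    {
        string[] fields = SplitLine("79,\"1,013.42\",GW450");

        // The quoted value stays in one piece and loses its quotes.
        foreach (string field in fields)
            Console.WriteLine(field);
    }
}
```

Because the lookahead rescans the rest of the line for every comma, this approach suits short rows like the ones in the question better than very long lines.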