How can I grab comma delimited values that appear after a term I searched for?
Problem description
Here's my code so far:
public void DeserialStream(string filePath)
{
    using (StreamReader sr = new StreamReader(filePath))
    {
        string currentline;
        while ((currentline = sr.ReadLine()) != null)
        {
            if (currentline.IndexOf("Count", StringComparison.CurrentCultureIgnoreCase) >= 0)
            {
                Console.WriteLine(currentline);
            }
        }
    }
}
How can I grab the comma-delimited values that appear after a term I searched for?
For example, suppose I have a CSV file that contains this data:
"Date","dd/mm/yyyy"
"ExpirationDate","dd/mm/yyyy"
"DataType","Count"
"Location","Unknown","Variable1","Variable2","Variable3"
"A(Loc3, Loc4)","Unknown","5656","787","42"
"A(Loc5, Loc6)","Unknown","25","878","921"
"DataType","Net"
"Location","Unknown","Variable1","Variable2","Variable3"
"A(Loc3, Loc4)","Unknown","5656","787","42"
"A(Loc5, Loc6)","Unknown","25","878","921"
But how would I grab the table of values after Count but before Net?
That is, I only want to parse the data in brackets:
"Date","dd/mm/yyyy"
"ExpirationDate","dd/mm/yyyy"
"DataType","Count"
[ "Location","Unknown","Variable1","Variable2","Variable3"
"A(Loc3, Loc4)","Unknown","5656","787","42"
"A(Loc5, Loc6)","Unknown","25","878","921"]
"DataType","Net"
"Location","Unknown","Variable1","Variable2","Variable3"
"A(Loc3, Loc4)","Unknown","5656","787","42"
"A(Loc5, Loc6)","Unknown","25","878","921"
I was thinking that maybe I should use a regular expression, or is there an easier way using the method above?
Recommended answer
You can use LINQ:
List<string> lines = File.ReadLines(path)
    .SkipWhile(l => l.IndexOf("\"Count\"", StringComparison.InvariantCultureIgnoreCase) == -1)
    .Skip(1) // skip the "Count"-line
    .TakeWhile(l => l.IndexOf("\"Net\"", StringComparison.InvariantCultureIgnoreCase) == -1)
    .ToList();
Use String.Split to get a string[] for every line. In general, I would use an available CSV parser, which handles edge cases and bad data, instead of reinventing the wheel.
Edit: If you want to split the fields into a List<string>, you should use a CSV parser as mentioned, since your data already uses a quoting character, so commas wrapped in double quotes (") should not be split.
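As one possible illustration of the "available CSV parser" route, here is a minimal sketch that uses TextFieldParser from Microsoft.VisualBasic.FileIO, which ships with .NET. The class name CountSectionReader and the method ReadSection are hypothetical names made up for this example, and the sketch assumes that each table starts right after a "DataType" row and ends at the next one, as in the sample data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class CountSectionReader
{
    // Hypothetical helper: returns the rows that follow the
    // "DataType","<sectionName>" row, up to the next "DataType" row.
    // Each row is returned as an array of unquoted fields.
    public static List<string[]> ReadSection(TextReader reader, string sectionName)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            // Commas inside double quotes stay part of the field.
            parser.HasFieldsEnclosedInQuotes = true;

            bool inSection = false;
            while (!parser.EndOfData)
            {
                string[] fields = parser.ReadFields();
                bool isSectionHeader = fields.Length >= 2 && fields[0] == "DataType";
                if (isSectionHeader)
                {
                    if (inSection)
                        break; // the next section starts here, so stop reading
                    inSection = string.Equals(fields[1], sectionName,
                                              StringComparison.OrdinalIgnoreCase);
                }
                else if (inSection)
                {
                    rows.Add(fields);
                }
            }
        }
        return rows;
    }
}
```

A call such as CountSectionReader.ReadSection(new StreamReader(path), "Count") would then give one string[] per table row, with "A(Loc3, Loc4)" kept as a single field.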
However, here is another simple but efficient approach using a StringBuilder:
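// Splits one CSV line into fields; commas inside double quotes are kept.
// Note: a doubled quote ("") inside a field is dropped, not emitted as a literal quote.
public static IEnumerable<string> SplitCSV(string csvString)
{
    var sb = new StringBuilder();
    bool quoted = false;
    foreach (char c in csvString)
    {
        if (quoted)
        {
            if (c == '"')
                quoted = false; // closing quote
            else
                sb.Append(c);
        }
        else
        {
            if (c == '"')
            {
                quoted = true; // opening quote
            }
            else if (c == ',')
            {
                // unquoted comma ends the current field
                yield return sb.ToString();
                sb.Length = 0;
            }
            else
            {
                sb.Append(c);
            }
        }
    }
    if (quoted)
        throw new ArgumentException("Unterminated quotation mark.", nameof(csvString));
    yield return sb.ToString();
}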
(Thanks to https://stackoverflow.com/a/4150727/284240 )
Now you can use SelectMany in the query above to flatten out all tokens:
List<string> allTokens = File.ReadLines(path)
    .SkipWhile(l => l.IndexOf("\"Count\"", StringComparison.InvariantCultureIgnoreCase) == -1)
    .Skip(1) // skip the "Count"-line
    .TakeWhile(l => l.IndexOf("\"Net\"", StringComparison.InvariantCultureIgnoreCase) == -1)
    .SelectMany(l => SplitCSV(l.Trim()))
    .ToList();
Result:
Location, Unknown, Variable1, Variable2, Variable3, A(Loc3, Loc4), Unknown, 5656, 787, 42, A(Loc5, Loc6), Unknown, 25, 878, 921, ""
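If you would rather keep the table shape than one flat list, a possible variation is to use Select instead of SelectMany, so that each line becomes its own array. The sketch below is only an illustration: CountTable and RowsBetween are hypothetical names, the splitter is repeated so the block compiles on its own, and blank lines (which would otherwise yield an empty token) are skipped as an assumption about what is wanted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class CountTable
{
    // Quote-aware splitter, repeated here so the sketch is self-contained.
    public static IEnumerable<string> SplitCSV(string csvString)
    {
        var sb = new StringBuilder();
        bool quoted = false;
        foreach (char c in csvString)
        {
            if (c == '"')
            {
                quoted = !quoted; // toggle on every quote character
            }
            else if (c == ',' && !quoted)
            {
                yield return sb.ToString();
                sb.Length = 0;
            }
            else
            {
                sb.Append(c);
            }
        }
        if (quoted)
            throw new ArgumentException("Unterminated quotation mark.", nameof(csvString));
        yield return sb.ToString();
    }

    // Hypothetical helper: one string[] per line between the line that
    // contains "startTerm" and the line that contains "endTerm" (both in quotes).
    public static List<string[]> RowsBetween(IEnumerable<string> lines,
                                             string startTerm, string endTerm)
    {
        string start = "\"" + startTerm + "\"";
        string end = "\"" + endTerm + "\"";
        return lines
            .SkipWhile(l => l.IndexOf(start, StringComparison.OrdinalIgnoreCase) == -1)
            .Skip(1) // skip the line that contains the start term
            .TakeWhile(l => l.IndexOf(end, StringComparison.OrdinalIgnoreCase) == -1)
            .Where(l => !string.IsNullOrWhiteSpace(l)) // assumption: ignore blank lines
            .Select(l => SplitCSV(l.Trim()).ToArray())
            .ToList();
    }
}
```

With the sample file, CountTable.RowsBetween(File.ReadLines(path), "Count", "Net") should give three rows of five fields each, the first being the header row.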