How to parse a text file with alternating lines of names and lists of integers?
Question
I need to read a file and put its data into different arrays.
My .txt file looks like this:
w1;
1 2 3
w2;
3 4 5
w3;
4 5 6
I tried the following:
int[] w1 = new int [3];
int[] w2 = new int [3];
int[] w3 = new int [3];
string v = "w1:|w2:|w3:";
foreach (string line in File.ReadAllLines(@"D:\\Data.txt"))
{
    string[] parts = Regex.Split(line, v);
}
I get that string, but I have no idea how to split its elements into the arrays shown above.
Recommended answer
Rather than parsing the file and putting the arrays into three hardcoded variables corresponding to the hardcoded names w1, w2 and w3, I would remove the hardcoding and parse the file into a Dictionary<string, int[]> like so:
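using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class DataFileExtensions
{
    public static Dictionary<string, int[]> ParseDataFile(string fileName)
    {
        var separators = new [] { ' ' };
        // Read the file lazily, two lines at a time: a name line followed by a data line.
        var query = from pair in File.ReadLines(fileName).Chunk(2)
                    // The name is the first line without its trailing ';'.
                    let key = pair[0].TrimEnd(';')
                    // A missing data line (odd number of lines) yields an empty array.
                    let value = (pair.Count < 2 ? "" : pair[1])
                        .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                        .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                        .ToArray()
                    select new { key, value };
        return query.ToDictionary(p => p.key, p => p.value);
    }
}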
public static class EnumerableExtensions
{
    // Adapted from the answer to "Split List into Sublists with LINQ" by casperOne
    // https://stackoverflow.com/questions/419019/split-list-into-sublists-with-linq/
    // https://stackoverflow.com/a/419058
    // https://stackoverflow.com/users/50776/casperone
    public static IEnumerable<List<T>> Chunk<T>(this IEnumerable<T> enumerable, int groupSize)
    {
        // The list to return.
        List<T> list = new List<T>(groupSize);
        // Cycle through all of the items.
        foreach (T item in enumerable)
        {
            // Add the item.
            list.Add(item);
            // If the list has the requested number of elements, return it.
            if (list.Count == groupSize)
            {
                // Return the list.
                yield return list;
                // Set the list to a new list.
                list = new List<T>(groupSize);
            }
        }
        // Return the remainder, if there is any.
        if (list.Count != 0)
        {
            // Return the list.
            yield return list;
        }
    }
}
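As a side note, .NET 6 and later ship a built-in Enumerable.Chunk in System.Linq, which yields arrays rather than lists. The following is only a sketch of how the same parsing might look with it, assuming a .NET 6+ target. The names BuiltInChunkSketch and ParseLines are hypothetical, and the method takes a sequence of lines (for example, the result of File.ReadLines(fileName)) so it can be tried without a file on disk. Enumerable.Chunk is called as a static method so that it cannot be confused with a custom Chunk extension in the same project.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical sketch, assuming .NET 6 or later (built-in Enumerable.Chunk).
public static class BuiltInChunkSketch
{
    public static Dictionary<string, int[]> ParseLines(IEnumerable<string> lines)
    {
        var separators = new[] { ' ' };
        // Enumerable.Chunk yields string[] chunks, so Length replaces Count.
        return Enumerable.Chunk(lines, 2)
            .ToDictionary(
                // The name is the first line without its trailing ';'.
                pair => pair[0].TrimEnd(';'),
                // A missing data line (odd number of lines) yields an empty array.
                pair => (pair.Length < 2 ? "" : pair[1])
                    .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                    .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                    .ToArray());
    }
}
```

With this variant the custom EnumerableExtensions class would no longer be needed, at the cost of requiring a newer framework.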
You can use ParseDataFile as follows:
var dictionary = DataFileExtensions.ParseDataFile(fileName);
Console.WriteLine("Result of parsing {0}, encountered {1} data arrays:", fileName, dictionary.Count);
foreach (var pair in dictionary)
{
    var name = pair.Key;
    var data = pair.Value;
    Console.WriteLine(" Data row name = {0}, values = [{1}]", name, string.Join(",", data));
}
Which outputs:
Result of parsing Question49341548.txt, encountered 3 data arrays:
 Data row name = w1, values = [1,2,3]
 Data row name = w2, values = [3,4,5]
 Data row name = w3, values = [4,5,6]
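If you still want separate w1, w2 and w3 arrays as in the question, they can be looked up by name once the dictionary exists. The following is a minimal sketch. DictionaryLookupSketch and GetOrEmpty are hypothetical names, and returning an empty array for a missing name is an assumption about the desired behavior rather than something the question specifies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for reading named arrays out of the parsed dictionary.
public static class DictionaryLookupSketch
{
    // Returns the array stored under 'name', or an empty array if the name is absent.
    public static int[] GetOrEmpty(IReadOnlyDictionary<string, int[]> data, string name)
    {
        return data.TryGetValue(name, out var values) ? values : Array.Empty<int>();
    }

    public static void Demo()
    {
        // Stand-in for the dictionary that the parsing step would produce.
        var dictionary = new Dictionary<string, int[]>
        {
            ["w1"] = new[] { 1, 2, 3 },
            ["w2"] = new[] { 3, 4, 5 },
            ["w3"] = new[] { 4, 5, 6 },
        };

        int[] w1 = GetOrEmpty(dictionary, "w1");
        int[] w2 = GetOrEmpty(dictionary, "w2");
        int[] w3 = GetOrEmpty(dictionary, "w3");
        Console.WriteLine("{0} | {1} | {2}",
            string.Join(",", w1), string.Join(",", w2), string.Join(",", w3));
    }
}
```

Using TryGetValue instead of the indexer avoids a KeyNotFoundException when the file does not contain one of the expected names.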
Notes:
- I parse the integer values using NumberFormatInfo.InvariantInfo to ensure consistent parsing in all locales.
- I break the lines of the file into chunks of two by using a lightly modified version of the method from casperOne's answer to "Split List into Sublists with LINQ".
- After breaking the file into chunks of pairs of lines, I trim the ; from the first line in each pair and use that as the dictionary key. The second line in each pair gets parsed into an array of integer values.
- If the names w1, w2 and so on are not unique, you could deserialize instead into a Lookup<string, int[]> by replacing ToDictionary() with ToLookup().
- Rather than loading the entire file into memory upfront using File.ReadAllLines(), I enumerate through it sequentially using File.ReadLines(). This should reduce memory usage without any additional complexity.
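The ToLookup() variant mentioned in the notes could look roughly like the sketch below. LookupParsingSketch, ReadPairs and ParseLines are hypothetical names. The sketch pairs up lines with a plain enumerator instead of the Chunk helper so that it is self-contained, and it takes a sequence of lines (such as File.ReadLines(fileName)) rather than a file name.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical sketch: same parsing idea, but repeated names are allowed.
public static class LookupParsingSketch
{
    // Walks the lines two at a time: a name line followed by a data line.
    private static IEnumerable<KeyValuePair<string, int[]>> ReadPairs(IEnumerable<string> lines)
    {
        var separators = new[] { ' ' };
        using (var e = lines.GetEnumerator())
        {
            while (e.MoveNext())
            {
                var name = e.Current.TrimEnd(';');
                // A trailing name with no data line yields an empty array.
                var dataLine = e.MoveNext() ? e.Current : "";
                var values = dataLine
                    .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                    .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                    .ToArray();
                yield return new KeyValuePair<string, int[]>(name, values);
            }
        }
    }

    // Groups every array under its name; a repeated name keeps all of its arrays.
    public static ILookup<string, int[]> ParseLines(IEnumerable<string> lines)
    {
        return ReadPairs(lines).ToLookup(p => p.Key, p => p.Value);
    }
}
```

Indexing the result with a name, for example lookup["w1"], then returns every array recorded under that name, and an empty sequence for a name that never appeared.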
Sample working .NET fiddle.