How to parse a text file with alternating lines of names and lists of integers?


Question

I need to read a file and put its data into different arrays.

My .txt file looks like:

w1;
1 2 3
w2;
3 4 5
w3;
4 5 6

I tried the following:

int[] w1 = new int [3];
int[] w2 = new int [3];
int[] w3 = new int [3];

string v = "w1:|w2:|w3:";
foreach (string line in File.ReadAllLines(@"D:\\Data.txt"))
{
   string[] parts = Regex.Split(line, v);

I got that string, but I have no idea how to split its elements into the arrays shown above.

Recommended answer

Rather than parsing the file and putting the arrays into three hardcoded variables corresponding to the hardcoded names w1, w2 and w3, I would remove the hardcoding and parse the file into a Dictionary<string, int[]> like so:

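using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public static class DataFileExtensions
{
    public static Dictionary<string, int[]> ParseDataFile(string fileName)
    {
        var separators = new [] { ' ' };
        // Read the file lazily, two lines at a time: a name line followed by a values line.
        var query = from pair in File.ReadLines(fileName).Chunk(2)
                    let key = pair[0].TrimEnd(';')
                    // A missing values line (odd number of lines) yields an empty array.
                    let value = (pair.Count < 2 ? "" : pair[1])
                        .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                        .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                        .ToArray()
                    select new { key, value };
        return query.ToDictionary(p => p.key, p => p.value);
    }
}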

public static class EnumerableExtensions
{
    // Adapted from the answer to "Split List into Sublists with LINQ" by casperOne
    // https://stackoverflow.com/questions/419019/split-list-into-sublists-with-linq/
    // https://stackoverflow.com/a/419058
    // https://stackoverflow.com/users/50776/casperone
    public static IEnumerable<List<T>> Chunk<T>(this IEnumerable<T> enumerable, int groupSize)
    {
        // The list to return.
        List<T> list = new List<T>(groupSize);

        // Cycle through all of the items.
        foreach (T item in enumerable)
        {
            // Add the item.
            list.Add(item);

            // If the list has the number of elements, return that.
            if (list.Count == groupSize)
            {
                // Return the list.
                yield return list;

                // Set the list to a new list.
                list = new List<T>(groupSize);
            }
        }

        // Return the remainder if there is any,
        if (list.Count != 0)
        {
            // Return the list.
            yield return list;
        }
    }
}

You can use it like this:

var dictionary = DataFileExtensions.ParseDataFile(fileName);

Console.WriteLine("Result of parsing {0}, encountered {1} data arrays:", fileName, dictionary.Count);
foreach (var pair in dictionary)
{
    var name = pair.Key;
    var data = pair.Value;

    Console.WriteLine("  Data row name = {0}, values = [{1}]", name, string.Join(",", data));
}       

Which outputs:

Result of parsing Question49341548.txt, encountered 3 data arrays:
  Data row name = w1, values = [1,2,3]
  Data row name = w2, values = [3,4,5]
  Data row name = w3, values = [4,5,6]
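If the original goal of having separate w1, w2 and w3 arrays still matters, they can be pulled out of the dictionary by name. The following is only a minimal sketch: NamedArraySketch, GetOrEmpty and Demo are hypothetical names that are not part of the answer, and the dictionary is built in memory rather than read from a file.

```csharp
using System;
using System.Collections.Generic;

public static class NamedArraySketch
{
    // Hypothetical helper: returns the array stored under 'name',
    // or an empty array when the name does not occur in the data.
    public static int[] GetOrEmpty(IReadOnlyDictionary<string, int[]> data, string name)
    {
        return data.TryGetValue(name, out var values) ? values : Array.Empty<int>();
    }

    public static void Demo()
    {
        // Stand-in for the result of DataFileExtensions.ParseDataFile(fileName).
        var dictionary = new Dictionary<string, int[]>
        {
            ["w1"] = new[] { 1, 2, 3 },
            ["w2"] = new[] { 3, 4, 5 },
            ["w3"] = new[] { 4, 5, 6 },
        };

        int[] w1 = GetOrEmpty(dictionary, "w1");
        int[] w4 = GetOrEmpty(dictionary, "w4"); // name not present, so the array is empty

        Console.WriteLine("w1 = [{0}]", string.Join(",", w1));
        Console.WriteLine("w4 has {0} elements", w4.Length);
    }
}
```

Using TryGetValue instead of the indexer avoids a KeyNotFoundException when a name is missing from the file.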

Notes:

  • I parse the integer values using NumberFormatInfo.InvariantInfo to ensure consistency of parsing in all locales.

  • I break the lines of the file into chunks of two by using a lightly modified version of the method from casperOne's answer to "Split List into Sublists with LINQ".

  • After breaking the file into chunks of pairs of lines, I trim the ; from the first line in each pair and use that as the dictionary key. The second line in each pair gets parsed into an array of integer values.

  • If the names w1, w2 and so on are not unique, you could deserialize instead into a Lookup<string, int[]> by replacing ToDictionary() with ToLookup().

  • Rather than loading the entire file into memory upfront using File.ReadAllLines(), I enumerate through it sequentially using File.ReadLines(). This should reduce memory usage without any additional complexity.
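The ToLookup() variant mentioned in the notes might look roughly like the sketch below. It rests on some assumptions: it targets .NET 6 or later, where System.Linq ships its own Enumerable.Chunk that yields arrays (hence pair.Length rather than pair.Count), so the custom Chunk extension is not included here. LookupSketch, ParseLines and Demo are hypothetical names, and the input is an in-memory sequence of lines instead of a file.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical helper: parses alternating name/value lines,
    // keeping every array when the same name appears more than once.
    public static ILookup<string, int[]> ParseLines(IEnumerable<string> lines)
    {
        var separators = new[] { ' ' };
        return lines
            .Chunk(2) // built-in since .NET 6; each chunk is a string[]
            .Select(pair => new
            {
                Key = pair[0].TrimEnd(';'),
                Value = (pair.Length < 2 ? "" : pair[1])
                    .Split(separators, StringSplitOptions.RemoveEmptyEntries)
                    .Select(s => int.Parse(s, NumberFormatInfo.InvariantInfo))
                    .ToArray()
            })
            .ToLookup(p => p.Key, p => p.Value);
    }

    public static void Demo()
    {
        // "w1" appears twice, which ToDictionary() would reject with an ArgumentException.
        var lines = new[] { "w1;", "1 2 3", "w2;", "3 4 5", "w1;", "4 5 6" };
        var lookup = ParseLines(lines);

        foreach (var group in lookup)
            foreach (var data in group)
                Console.WriteLine("{0}: [{1}]", group.Key, string.Join(",", data));
    }
}
```

Indexing the lookup with a name that never occurs returns an empty sequence rather than throwing, which can simplify callers that probe for optional names.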

Sample working .Net fiddle.
