Fast and efficient way to read a space-separated file of numbers into an array?

Problem description

I need a fast and efficient method to read a space separated file with numbers into an array. The files are formatted this way:

4 6
1 2 3 4 5 6
2 5 4 3 21111 101
3 5 6234 1 2 3
4 2 33434 4 5 6

The first row is the dimension of the array [rows columns]. The lines following contain the array data.

The data may also be formatted without any newlines like this:

4 6
1 2 3 4 5 6 2 5 4 3 21111 101 3 5 6234 1 2 3 4 2 33434 4 5 6

I can read the first line and initialize an array with the row and column values. Then I need to fill the array with the data values. My first idea was to read the file line by line and use the split function. But the second format listed gives me pause, because the entire array data would be loaded into memory all at once. Some of these files are hundreds of MB in size. The second method would be to read the file in chunks and then parse them piece by piece. Maybe somebody else has a better way of doing this?

Recommended answer

How about:

    using System.IO;

    static class Program
    {
        static void Main()
        {
            // Sample data: the first line holds the dimensions (rows, columns).
            File.WriteAllText("my.data",
                "4 6\n1 2 3 4 5 6\n2 5 4 3 21111 101\n3 5 6234 1 2 3\n4 2 33434 4 5 6");

            using (Stream s = new BufferedStream(File.OpenRead("my.data")))
            {
                int rows = ReadInt32(s), cols = ReadInt32(s);
                int[,] arr = new int[rows, cols];
                for (int y = 0; y < rows; y++)
                    for (int x = 0; x < cols; x++)
                    {
                        arr[y, x] = ReadInt32(s);
                    }
            }
        }

        private static int ReadInt32(Stream s)
        { // edited to improve handling of multiple spaces etc
            int b;
            // skip any preceding non-digit bytes
            while ((b = s.ReadByte()) >= 0 && (b < '0' || b > '9')) { }
            if (b < 0) throw new EndOfStreamException();

            // accumulate decimal digits (no sign or overflow handling)
            int result = b - '0';
            while ((b = s.ReadByte()) >= '0' && b <= '9')
            {
                result = result * 10 + (b - '0');
            }
            return result;
        }
    }

Actually, this isn't very specific about the delimiters - it'll pretty much assume that anything that isn't a digit is a delimiter, so a minus sign is skipped like any other separator. It also only supports ASCII (you can use a reader if you need other encodings).
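
If the files are not plain ASCII, or may contain negative values, the same idea can be sketched on top of a `TextReader`. This is only an illustration, not part of the original answer: the class name `TextNumberReader`, the `ReadMatrix` helper, and the choice to treat a `-` directly before a digit as a sign are all assumptions.

```csharp
using System;
using System.IO;
using System.Text;

static class TextNumberReader   // hypothetical name, for illustration only
{
    // Reads the next integer from the reader, treating every character that
    // is not a digit as a delimiter. Assumption: a '-' that immediately
    // precedes the first digit marks the number as negative.
    public static int ReadInt32(TextReader reader)
    {
        int c;
        bool negative = false;

        // Skip delimiters, remembering whether the last one was '-'.
        while ((c = reader.Read()) >= 0 && (c < '0' || c > '9'))
        {
            negative = (c == '-');
        }
        if (c < 0) throw new EndOfStreamException();

        int result = c - '0';
        // Peek first so the character after the number is not consumed;
        // it may be the '-' that belongs to the next value.
        while ((c = reader.Peek()) >= '0' && c <= '9')
        {
            reader.Read();
            // checked: throw OverflowException instead of wrapping silently.
            result = checked(result * 10 + (c - '0'));
        }
        return negative ? -result : result;
    }

    // Reads "rows cols" followed by rows * cols values, as in the question.
    public static int[,] ReadMatrix(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            int rows = ReadInt32(reader), cols = ReadInt32(reader);
            var arr = new int[rows, cols];
            for (int y = 0; y < rows; y++)
                for (int x = 0; x < cols; x++)
                    arr[y, x] = ReadInt32(reader);
            return arr;
        }
    }
}
```

A call such as `int[,] arr = TextNumberReader.ReadMatrix("my.data", Encoding.UTF8);` would then fill the array. `StreamReader` buffers internally, so the file is still read in chunks rather than loaded whole, though decoding characters will likely be somewhat slower than the raw byte loop above.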
