How to convert text files to binary in an efficient way in C#


Question

I have looked at several methods for converting text files to binary and found some answers here as well. However, most of them confused me because of Unity's .NET compatibility, and I am also unsure how the binary output should be structured.

I have a text file (an exported point cloud) that holds the positions of points in 3D space along with color information, like this:

X Y Z colorvalues
-0.680891 -90.6809 0 204 204 204 255

I was reading this to create meshes at run time with a script like this:

string[] buffer;

for (int i = 0; i < Area.nPoints; i++)
{
    buffer = sr.ReadLine().Split();

    Area.AddPoint(new Vector3(
        float.Parse(buffer[0]),
        float.Parse(buffer[1]),
        float.Parse(buffer[2])));
}

This works, but because I read each line and split it, it is quite slow, and my text file has around 75 million lines (points). I found out that I could convert the file to binary so that reading would be faster. I did that, and reading was a lot faster. However, the conversion to binary is now quite slow, so I wanted to ask about the way I convert:

void WriteValues()
{
    string[] buffer;

    for (int i = 0; i < numPoints; i++)
    {
        buffer = sr.ReadLine().Split();
        // Write the x, y, z values; the inner loop must test and advance j, not i.
        for (int j = 0; j < 3; j++)
        {
            wr.Write(float.Parse(buffer[j]));
        }
    }
    wr.Close();
}

I then read the result with BinaryReader.ReadSingle(), but the conversion takes a lot more time than reading directly from the text, because I again read each line and split it.

My question is: could I read, say, the next 1000 lines into a buffer and then write them, instead of handling every line individually? Would that make a difference? If so, how can I use the stream once for every 1000 lines?

Also, when I convert a line to binary, how can I read every float in the line without splitting the string? Thanks in advance for any help!

I am doing this to visualize a point cloud on my mobile phone using augmented reality. I want to do the scan, export the point cloud, import it into Unity, and create a mesh from those points without triangulating. With my initial approach, importing takes 15 to 18 minutes. After converting to binary it takes less than 3 minutes, which is okay. However, the conversion to binary now takes a lot of time :)

Recommended answer

A reasonably quick way to read the file is with a buffered file stream. Without the float parsing, reading takes about 14 seconds on my machine; with float parsing it takes about 74 seconds. (I just summed the values, since I don't have Unity to play with.)

using System;
using System.Diagnostics;
using System.IO;

var sw = new Stopwatch();
sw.Start();
double sum = 0;
var fs = new FileStream("demo.txt", FileMode.Open, FileAccess.Read);
using (var bs = new BufferedStream(fs))
using (var r = new StreamReader(bs))
{
    r.ReadLine(); // skip the "X Y Z colorvalues" header line
    while (!r.EndOfStream)
    {
        var l = r.ReadLine();
        var split = l.Split();
        // Only the first three tokens (x, y, z) are parsed; the color values are ignored.
        var x = float.Parse(split[0]);
        var y = float.Parse(split[1]);
        var z = float.Parse(split[2]);
        sum += x + y + z;
    }
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds / 1000M); // elapsed seconds
Console.WriteLine(sum);
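
Since the float parsing dominates the timing above, and the question asks how to read the floats without splitting the string, here is a minimal sketch of one way to do it. It walks the line by index and only materializes the first three tokens, so the four color values never become strings. The names `PointLineParser`, `ParseXyz`, and `NextFloat` are hypothetical, and the sketch assumes space- or tab-separated values with `.` as the decimal separator. I have not timed it against `Split()`.

```csharp
using System;
using System.Globalization;

public static class PointLineParser
{
    // Hypothetical helper: parses the first three whitespace-separated floats
    // of a line without calling Split(), so the color tokens are never allocated.
    public static void ParseXyz(string line, out float x, out float y, out float z)
    {
        int pos = 0;
        x = NextFloat(line, ref pos);
        y = NextFloat(line, ref pos);
        z = NextFloat(line, ref pos);
    }

    private static float NextFloat(string line, ref int pos)
    {
        // Skip leading spaces or tabs.
        while (pos < line.Length && (line[pos] == ' ' || line[pos] == '\t')) pos++;
        int start = pos;
        // Advance to the end of the current token.
        while (pos < line.Length && line[pos] != ' ' && line[pos] != '\t') pos++;
        // Assumption: InvariantCulture, so '.' is the decimal separator on every device.
        // Throws FormatException if the line has fewer than three tokens.
        return float.Parse(line.Substring(start, pos - start),
                           NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

Inside the read loop, `PointLineParser.ParseXyz(l, out var x, out var y, out var z);` would stand in for the `Split()` call and the three `float.Parse` calls.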

Out of interest, I also changed the code to write the data out as a stream of floats (in triplets). Reading it back looks like this:

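using System;
using System.Diagnostics;
using System.IO;

var sw = new Stopwatch();
sw.Start();
double sum = 0;
var fs = new FileStream("demo.bin", FileMode.Open, FileAccess.Read);
using (var bs = new BufferedStream(fs))
using (var r = new BinaryReader(bs))
{
    // The point count is hard-coded to match the generated demo file.
    for (int i = 0; i < 75000000; i++)
    {
        // Each point is stored as three consecutive 32-bit floats: x, y, z.
        var x = r.ReadSingle();
        var y = r.ReadSingle();
        var z = r.ReadSingle();
        sum += x + y + z;
    }
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds / 1000M); // elapsed seconds
Console.WriteLine(sum);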

This takes about 9 seconds.
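
The binary file is read one float at a time with `ReadSingle()`. A possible variation, sketched below, reads large byte chunks and copies them straight into a `float[]` with `Buffer.BlockCopy`. The names `BinaryPointReader` and `ReadAllFloats` are hypothetical. The sketch assumes the file is smaller than 2 GB and that the platform is little-endian, which matches what `BinaryWriter` produces. I have not measured whether it beats the 9 seconds above.

```csharp
using System;
using System.IO;

public static class BinaryPointReader
{
    // Hypothetical helper: loads a file of raw 32-bit floats (x, y, z triplets)
    // into one array using large chunked reads instead of ReadSingle() calls.
    public static float[] ReadAllFloats(string path, int chunkBytes = 1 << 20)
    {
        if (chunkBytes < sizeof(float))
            throw new ArgumentOutOfRangeException(nameof(chunkBytes));

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, chunkBytes))
        {
            // Assumption: the file is under 2 GB, so byte offsets fit in an int.
            var floats = new float[fs.Length / sizeof(float)];
            int totalBytes = floats.Length * sizeof(float);
            // Round the chunk size down to a whole number of floats.
            var buffer = new byte[chunkBytes - chunkBytes % sizeof(float)];
            int byteOffset = 0;

            while (byteOffset < totalBytes)
            {
                int want = Math.Min(buffer.Length, totalBytes - byteOffset);
                int got = 0;
                // Stream.Read may return fewer bytes than requested, so loop.
                while (got < want)
                {
                    int n = fs.Read(buffer, got, want - got);
                    if (n == 0) throw new EndOfStreamException();
                    got += n;
                }
                // Raw byte copy; assumes a little-endian platform.
                Buffer.BlockCopy(buffer, 0, floats, byteOffset, got);
                byteOffset += got;
            }
            return floats;
        }
    }
}
```

For 75 million points this allocates roughly 900 MB (75,000,000 × 3 floats × 4 bytes), which may be too much for a phone. Processing each chunk as it arrives, instead of keeping the whole array, would avoid that.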

Just for completeness, I used the following code to generate the demo files:

using System;
using System.IO;
using System.Linq;

var random = new Random();
File.WriteAllText("demo.txt", "X         Y        Z colorvalues\r\n");
using (var fs = new FileStream("demo.bin", FileMode.Create, FileAccess.Write, FileShare.None))
using (var bw = new BinaryWriter(fs))
using (var writer = File.AppendText("demo.txt"))
{
    for (int i = 0; i < 75000000; i++)
    {
        var x = (float) random.NextDouble() * 200;
        var y = (float) random.NextDouble() * 200;
        var z = (float) random.NextDouble() * 200;
        // Random.Next's upper bound is exclusive, so 256 allows color values 0..255.
        var c = Enumerable.Range(0, 4).Select(n => random.Next(0, 256)).ToArray();
        // The text file gets the full line; the binary file gets only x, y, z.
        writer.WriteLine($"{x} {y} {z} {c[0]} {c[1]} {c[2]} {c[3]}");
        bw.Write(x);
        bw.Write(y);
        bw.Write(z);
    }
}
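
The generator writes both files at once, whereas the question is about converting an existing text file. A minimal sketch of that conversion, as a single streaming pass, could look like the following. The names `PointCloudConverter` and `ConvertToBinary` are hypothetical. The sketch assumes one header line followed by space- or tab-separated values with `.` as the decimal separator. `StreamReader` and `FileStream` already buffer internally, so batching 1000 lines by hand is probably unnecessary; the per-line cost is mostly splitting and float parsing.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class PointCloudConverter
{
    private static readonly char[] Separators = { ' ', '\t' };

    // Hypothetical helper: converts "x y z r g b a" text lines into a flat stream
    // of 32-bit floats (x, y, z per point). Returns the number of points written.
    public static int ConvertToBinary(string textPath, string binaryPath,
                                      int bufferSize = 1 << 16)
    {
        int count = 0;
        using (var reader = new StreamReader(textPath, Encoding.ASCII, false, bufferSize))
        using (var output = new FileStream(binaryPath, FileMode.Create,
                                           FileAccess.Write, FileShare.None, bufferSize))
        using (var writer = new BinaryWriter(output))
        {
            reader.ReadLine(); // assumption: the first line is a header

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] parts = line.Split(Separators, StringSplitOptions.RemoveEmptyEntries);
                if (parts.Length < 3) continue; // skip blank or malformed lines

                for (int j = 0; j < 3; j++)
                {
                    writer.Write(float.Parse(parts[j], NumberStyles.Float,
                                             CultureInfo.InvariantCulture));
                }
                count++;
            }
        }
        return count;
    }
}
```

A call such as `int n = PointCloudConverter.ConvertToBinary("demo.txt", "demo.bin");` returns the point count, which a reader could use in place of the hard-coded 75000000.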
