How to convert text files to binary in an efficient way in C#
Question
I have checked several methods for converting text files to binary and found some answers here as well. However, most of them confused me because of Unity's .NET compatibility, and I am also unsure how to structure the conversion from text to binary.
I have a text file (exported point cloud) which holds positions of points in 3D space and color information like this:
X Y Z colorvalues
-0.680891 -90.6809 0 204 204 204 255
I was reading this to create meshes at run time with a script like this:
string[] buffer;
for (int i = 0; i < Area.nPoints; i++)
{
    buffer = sr.ReadLine().Split();
    Area.AddPoint(new Vector3(float.Parse(buffer[0]),
        float.Parse(buffer[1]), float.Parse(buffer[2])));
}
This works, but since I read each line and split it, it is quite slow, and I have around 75 million lines (points) in my text file. I found out that I could convert the file to binary so that reading would be faster, which I did, and reading was a lot faster. However, the conversion to binary is now quite slow, so I wanted to ask about the way I convert.
void WriteValues()
{
    string[] buffer;
    for (int i = 0; i < numPoints; i++)
    {
        buffer = sr.ReadLine().Split();
        // Write the X, Y and Z coordinates of this point.
        for (int j = 0; j < 3; j++)
        {
            wr.Write(float.Parse(buffer[j]));
        }
    }
    wr.Close();
}
Then I read it with BinaryReader.ReadSingle(), but this takes a lot more time than reading directly from the text, because I again read each line and split it.
My question is: could I read, let's say, the next 1000 lines, buffer them, and then write, instead of handling every line separately? Would it make a difference? If so, how can I use the stream once for every 1000 lines?
Also, when I convert a line to binary, how can I read every float in the line without splitting the string? Thanks in advance for any help!
I am trying to do this to visualize a point cloud on my mobile phone using Augmented Reality. So I want to do the scan, export the point cloud, import it into Unity and create a mesh from those points without triangulating, but with my initial approach it takes 15-18 minutes to import. After converting to binary it takes less than 3 minutes, which is okay. However, this time the conversion to binary takes a lot of time :)
Recommended answer
So a reasonably quick way to read is with a buffered file stream. Without the float parsing, reading takes about 14 seconds on my machine; with float parsing it takes about 74 seconds. (I just summed the values, since I don't have Unity to play with.)
// Requires: using System; using System.Diagnostics; using System.IO;
var sw = new Stopwatch();
sw.Start();
double sum = 0;
var fs = new FileStream("demo.txt", FileMode.Open, FileAccess.Read);
using (var bs = new BufferedStream(fs))
using (var r = new StreamReader(bs))
{
    r.ReadLine(); // skip the "X Y Z colorvalues" header line
    while (!r.EndOfStream)
    {
        var l = r.ReadLine();
        var split = l.Split();
        var x = float.Parse(split[0]);
        var y = float.Parse(split[1]);
        var z = float.Parse(split[2]);
        sum += x + y + z; // stand-in for the real per-point work
    }
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds / 1000M); // elapsed seconds
Console.WriteLine(sum);
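On the question of reading the floats without splitting the string: one possible approach is to walk the line by index and parse only the first three tokens, so the color values are never turned into strings. The sketch below is only an illustration; `PointLineParser` and `ParseXyz` are hypothetical names, and it assumes whitespace-separated values with `.` as the decimal separator (hence `CultureInfo.InvariantCulture`). Whether it is actually faster than `Split()` on a given Unity runtime would need to be measured.

```csharp
using System;
using System.Globalization;

public static class PointLineParser
{
    // Hypothetical helper: parses the first three whitespace-separated floats
    // of a line such as "-0.680891 -90.6809 0 204 204 204 255" without Split().
    public static void ParseXyz(string line, out float x, out float y, out float z)
    {
        int pos = 0;
        x = NextFloat(line, ref pos);
        y = NextFloat(line, ref pos);
        z = NextFloat(line, ref pos);
    }

    private static float NextFloat(string line, ref int pos)
    {
        // Skip any separators in front of the token.
        while (pos < line.Length && char.IsWhiteSpace(line[pos])) pos++;
        int start = pos;
        // Advance to the end of the token.
        while (pos < line.Length && !char.IsWhiteSpace(line[pos])) pos++;
        // InvariantCulture so that "." is always read as the decimal separator.
        // Throws FormatException if the line has fewer than three values.
        return float.Parse(line.Substring(start, pos - start),
                           NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

This creates three short substrings per line instead of an array holding all seven tokens; the rest of the line is never examined.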
Out of interest, I also changed the code to write the data out as a stream of floats (in triplets).
Using
var sw = new Stopwatch();
sw.Start();
double sum = 0;
var fs = new FileStream("demo.bin", FileMode.Open, FileAccess.Read);
using (var bs = new BufferedStream(fs))
using (var r = new BinaryReader(bs))
{
    for (int i = 0; i < 75000000; i++)
    {
        var x = r.ReadSingle();
        var y = r.ReadSingle();
        var z = r.ReadSingle();
        sum += x + y + z;
    }
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds / 1000M);
Console.WriteLine(sum);
takes about 9 seconds.
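The question also asks whether data can be read in batches rather than one value at a time. For the binary file, one way to sketch that is to read a large block of bytes and reinterpret it as floats with `Buffer.BlockCopy`, instead of calling `ReadSingle()` three times per point. `BinaryPointReader`, `ReadTriplets` and the `pointsPerChunk` parameter are hypothetical names, and the sketch assumes a little-endian machine, so that the bytes written by `BinaryWriter` match the in-memory float layout. No timing is claimed for it.

```csharp
using System;
using System.IO;

public static class BinaryPointReader
{
    // Hypothetical helper: reads a file of raw 32-bit floats in large chunks
    // and calls onPoint for every complete x, y, z triplet. Returns the point count.
    public static long ReadTriplets(string path, Action<float, float, float> onPoint,
                                    int pointsPerChunk = 65536)
    {
        const int BytesPerPoint = 3 * sizeof(float); // 12 bytes per point
        var bytes = new byte[pointsPerChunk * BytesPerPoint];
        var floats = new float[pointsPerChunk * 3];
        long total = 0;

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int filled = 0;
            while (true)
            {
                int n = fs.Read(bytes, filled, bytes.Length - filled);
                filled += n;
                // Keep reading until the chunk is full or the file ends.
                if (n > 0 && filled < bytes.Length) continue;

                int points = filled / BytesPerPoint; // trailing partial point is ignored
                // Copy the raw bytes into the float array (assumes little-endian).
                Buffer.BlockCopy(bytes, 0, floats, 0, points * BytesPerPoint);
                for (int i = 0; i < points; i++)
                    onPoint(floats[3 * i], floats[3 * i + 1], floats[3 * i + 2]);
                total += points;

                if (n == 0) break; // end of file
                filled = 0;
            }
        }
        return total;
    }
}
```

A call such as `BinaryPointReader.ReadTriplets("demo.bin", (x, y, z) => sum += x + y + z)` would reproduce the summing test above; in Unity the callback could add the point to the mesh instead.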
Just for completeness, I used the following code to generate the demo files.
// Requires: using System; using System.IO; using System.Linq;
var random = new Random();
File.WriteAllText("demo.txt", "X Y Z colorvalues\r\n");
using (var fs = new FileStream("demo.bin", FileMode.Create, FileAccess.Write, FileShare.None))
using (var bw = new BinaryWriter(fs))
using (var writer = File.AppendText("demo.txt"))
{
    for (int i = 0; i < 75000000; i++)
    {
        var x = (float) random.NextDouble() * 200;
        var y = (float) random.NextDouble() * 200;
        var z = (float) random.NextDouble() * 200;
        // The upper bound of Random.Next is exclusive, so 256 yields values 0..255.
        var c = Enumerable.Range(0, 4).Select(n => random.Next(0, 256)).ToArray();
        writer.WriteLine($"{x} {y} {z} {c[0]} {c[1]} {c[2]} {c[3]}");
        // The binary file holds only the x, y, z triplet for each point.
        bw.Write(x);
        bw.Write(y);
        bw.Write(z);
    }
}
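The generator above writes both files at once. For an already exported text file, a one-pass converter might look like the sketch below. `PointCloudConverter` and `TextToBinary` are hypothetical names; the sketch assumes a single header line, single-space separators and `.` as the decimal separator. `StreamReader` and `FileStream` already buffer internally, so grouping lines into batches of 1000 by hand is probably not needed, though that is worth verifying on the target platform.

```csharp
using System;
using System.Globalization;
using System.IO;

public static class PointCloudConverter
{
    // Hypothetical helper: converts "X Y Z r g b a" text lines into raw float
    // triplets and returns the number of points written.
    public static long TextToBinary(string textPath, string binPath)
    {
        long count = 0;
        using (var reader = new StreamReader(textPath))
        using (var output = new FileStream(binPath, FileMode.Create, FileAccess.Write,
                                           FileShare.None, 1 << 16)) // 64 KiB write buffer
        using (var writer = new BinaryWriter(output))
        {
            reader.ReadLine(); // skip the header line (assumed to be present)
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0) continue; // tolerate blank lines
                var parts = line.Split(' ');
                // Only the first three columns (X, Y, Z) are written.
                for (int j = 0; j < 3; j++)
                    writer.Write(float.Parse(parts[j], NumberStyles.Float,
                                             CultureInfo.InvariantCulture));
                count++;
            }
        }
        return count;
    }
}
```

Reading lines until `ReadLine()` returns null avoids having to know the point count in advance; the resulting file size divided by 12 gives the number of points when reading it back.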