Is there a way in Matlab to determine the number of lines in a file without looping through each line?


Problem description

Obviously one could loop through a file using fgetl or similar function and increment a counter, but is there a way to determine the number of lines in a file without doing such a loop?

Solution

I like to use the following code for exactly this task:

fid = fopen('someTextFile.txt', 'rb');
%# Get the file size in bytes.
fseek(fid, 0, 'eof');
fileSize = ftell(fid);
frewind(fid);
%# Read the whole file as raw bytes.
data = fread(fid, fileSize, 'uint8');
fclose(fid);
%# Count the line feeds (byte value 10) and add one for the last line.
%# If the file ends with a line feed, this count is one too high.
numLines = sum(data == 10) + 1;

It is pretty fast if you have enough memory to read the whole file at once. It should work for both Windows- and Linux-style line endings, since both terminate a line with a line-feed byte (value 10).
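If the file is too large to read in one go, the same byte-counting idea can be applied block by block. The following is only a minimal sketch, not part of the original answer: the function name `countLinesChunked` and the 1 MiB default block size are assumptions chosen for illustration. Unlike the snippet above, it adds one for the last line only when the file does not end with a line feed.

```matlab
function numLines = countLinesChunked(fileName, chunkSize)
% countLinesChunked  Hypothetical helper: count lines without loading the
% whole file into memory. chunkSize is the number of bytes read per block.
    if nargin < 2
        chunkSize = 2^20;   % assumed default block size (1 MiB)
    end
    fid = fopen(fileName, 'rb');
    if fid == -1
        error('Could not open %s.', fileName);
    end
    closeFile = onCleanup(@() fclose(fid));   % close the file even on error

    numLines = 0;
    lastByte = 10;   % so that an empty file yields zero lines
    while true
        data = fread(fid, chunkSize, '*uint8');   % read the next block of bytes
        if isempty(data)
            break;   % end of file reached
        end
        numLines = numLines + sum(data == 10);    % count line feeds in this block
        lastByte = data(end);
    end
    % A final line without a trailing line feed still counts as a line.
    if lastByte ~= 10
        numLines = numLines + 1;
    end
end
```

Called as `countLinesChunked('someTextFile.txt')`, this keeps memory use bounded by the block size, at the cost of a loop over blocks rather than over lines.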

Edit: I measured the performance of the answers provided so far. Here are the results for determining the number of lines in a text file containing 1 million double values (one value per line), averaged over 10 runs.

 Author           Mean time +- standard deviation (s)
------------------------------------------------------
 Rody Oldenhuis      0.3189 +- 0.0314
 Edric (2)           0.3282 +- 0.0248
 Mehrwolf            0.4075 +- 0.0178
 Jonas               1.0813 +- 0.0665
 Edric (1)          26.8825 +- 0.6790
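A measurement of this kind could be set up roughly as follows. This is a sketch under assumptions, not the script used for the numbers above: the file name, the use of `rand` for the test data, and timing with `tic`/`toc` are all illustrative choices, and only the binary-read approach is timed here.

```matlab
% Create a test file with 1 million double values, one per line.
fileName = 'someTextFile.txt';          % assumed file name
fid = fopen(fileName, 'w');
fprintf(fid, '%.15g\n', rand(1e6, 1));  % the format is reused for every value
fclose(fid);

numRuns = 10;
times = zeros(numRuns, 1);
for k = 1:numRuns
    tic;
    fid = fopen(fileName, 'rb');
    data = fread(fid, Inf, '*uint8');   % read the whole file as raw bytes
    fclose(fid);
    numLines = sum(data == 10);         % every line here ends with a line feed
    times(k) = toc;
end

% Report mean and standard deviation in the same form as the table.
fprintf('%.4f +- %.4f s\n', mean(times), std(times));
```

Absolute timings depend on the machine and on disk caching, so only the relative ordering of the approaches is likely to carry over.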

So the fastest approaches are the one using Perl and the one reading the whole file as binary data. I would not be surprised if Perl internally also reads large blocks of the file at once instead of looping through it line by line (just a guess, I do not know anything about Perl).

A simple fgetl() loop is slower than the other approaches by a factor of 25-75.
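For reference, the kind of fgetl() loop being compared against might look like the sketch below. It is an illustration of the looping approach mentioned in the question, not necessarily the exact code that was timed.

```matlab
fid = fopen('someTextFile.txt', 'r');
numLines = 0;
% fgetl returns a character vector for each line and -1 at end of file.
while ischar(fgetl(fid))
    numLines = numLines + 1;
end
fclose(fid);
```

Each iteration makes a function call and allocates a character vector for one line, which is a plausible reason for the gap compared with reading the raw bytes in one call.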

Edit 2: Included Edric's second approach, which is much faster than his first and on par with the Perl solution, I'd say.
