An exercise in fread optimisation


Problem description

Hi everyone

BTW, thanks to everyone in this group, your collective advice has been
very helpful. I have to say, the C guys are definitely much nicer
than the Lisp guys ;-).

Tonight, I was thinking about freads, and how to get them faster. I
initially wrote a program like this:

-- START CODE --
#include <stdio.h>
#include <time.h>

#define BUFSIZE 32768
#define CHAR_SIZE 1

int main(int argc, char **argv) {
    clock_t start = clock();

    char buf[BUFSIZE + 1];
    memset(buf, '\0', BUFSIZE + 1);

    FILE *file = fopen("stuff.txt", "rb");

    while (fread(buf, BUFSIZE, CHAR_SIZE, file)) {
        //printf("%s", buf);
        memset(buf, '\0', BUFSIZE);
    }
    //printf("%s", buf);

    fclose(file);

    clock_t finish = clock();
    printf("Total CPU time: %d\n", finish - start);
}
-- END CODE --

Average CPU time: 140-156

BTW, stuff.txt is a 200MB binary file of randomness.

I looked at the memsets, and thought - this could maybe be faster.

-- START CODE --
#include <stdio.h>
#include <time.h>

#define BUFSIZE 32768
#define CHAR_SIZE 1

int main(int argc, char **argv) {
    clock_t start = clock();

    char buf[BUFSIZE + 1];
    buf[BUFSIZE] = '\0';

    FILE *file = fopen("stuff.txt", "rb");

    // Prepare
    fseek(file, 0, SEEK_END);
    int size = ftell(file);
    int iterations = size / BUFSIZE;
    int remaining = size % BUFSIZE;
    rewind(file);

    /*
    printf("Size: %d\n", size);
    printf("Iterations: %d\n", iterations);
    printf("Remainder: %d\n", remaining);
    */

    // Iterate
    int i;
    for (i = 0; i < iterations; i++) {
        fread(buf, BUFSIZE, CHAR_SIZE, file);
        //printf("%s", buf);
    }

    fread(buf, BUFSIZE, CHAR_SIZE, file);
    buf[remaining] = '\0';
    //printf("%s", buf);

    fclose(file);

    clock_t finish = clock();
    printf("Total CPU time: %d\n", finish - start);
}
-- END CODE --

Average CPU time: 125

What does everyone think? Could it be better?

Hope this was useful to someone.

Chris

Solution

Khookie wrote:

> Hi everyone
>
> BTW, thanks to everyone in this group, your collective advice has been
> very helpful. I have to say, the C guys are definitely much nicer
> than the Lisp guys ;-).
>
> Tonight, I was thinking about freads, and how to get them faster. I
> initially wrote a program like this:
>
> -- START CODE --
> #include <stdio.h>
> #include <time.h>
>
> #define BUFSIZE 32768
> #define CHAR_SIZE 1
>
> int main(int argc, char **argv) {
>     clock_t start = clock();
>
>     char buf[BUFSIZE + 1];
>     memset(buf, '\0', BUFSIZE + 1);
>
>     FILE *file = fopen("stuff.txt", "rb");
>
>     while (fread(buf, BUFSIZE, CHAR_SIZE, file)) {

That's a perverse way of expressing it, IMHO... You are specifying you
want one item (CHAR_SIZE is used where a count is expected) of size
BUFSIZE. I'm not sure whether fread is guaranteed to produce consistent
data when the file is not a whole multiple of BUFSIZE bytes long.
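
To make the point about argument order concrete, here is a minimal sketch, not taken from the thread. The two helper names are hypothetical. It relies only on what the C standard says about fread: the return value is the number of complete items read, and the contents of a partially read item are indeterminate. With one item of `chunk` bytes per call, a short tail at the end of the file yields a return value of 0, so a loop like the one in the post never learns how many bytes of the tail arrived.

```c
#include <stdio.h>

/* Hypothetical helper: total bytes accounted for when each fread call asks
   for ONE item of `chunk` bytes (the argument order used in the post). */
size_t bytes_seen_one_big_item(FILE *fp, char *buf, size_t chunk)
{
    size_t total = 0;
    while (fread(buf, chunk, 1, fp) == 1)   /* returns 0 or 1 items */
        total += chunk;
    return total;                           /* a short tail is not counted */
}

/* Hypothetical helper: same loop, asking for `chunk` items of 1 byte each. */
size_t bytes_seen_many_small_items(FILE *fp, char *buf, size_t chunk)
{
    size_t total = 0, n;
    while ((n = fread(buf, 1, chunk, fp)) > 0)  /* returns bytes read */
        total += n;
    return total;
}
```

On a 10-byte stream with `chunk` set to 8, the first helper accounts for 8 bytes and the second for all 10. `buf` must have room for at least `chunk` bytes in both cases.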

>         //printf("%s", buf);
>         memset(buf, '\0', BUFSIZE);
>     }

[snip]

> BTW, stuff.txt is a 200MB binary file of randomness.
>
> I looked at the memsets, and thought - this could maybe be faster.

Why are you memset()ing at all?

If it's truly random, then it could contain a '\0' anywhere, surely?

So there's no point in trying to use '\0' as a terminator.

fread() will tell you how many items it read. If you used
fread(buf,CHAR_SIZE,BUFSIZE,file)
you would get a useful return code (the number of bytes read) from it.
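
A sketch of what that advice might look like in practice, under assumptions of my own: the helper name `count_zero_bytes` and the `CHUNK` macro are hypothetical, and counting '\0' bytes is just a stand-in for whatever processing the data needs. The length returned by fread marks the end of the valid data, so the buffer needs neither a terminator slot nor a memset.

```c
#include <stdio.h>

#define CHUNK 32768  /* same value as BUFSIZE in the post; an arbitrary choice */

/* Hypothetical helper: reads fp to the end, storing the number of '\0'
   bytes in *zeros and the number of bytes read in *total.
   Returns 0 on success, -1 if a read error occurred. */
int count_zero_bytes(FILE *fp, size_t *zeros, size_t *total)
{
    char buf[CHUNK];          /* no extra byte for a terminator */
    size_t n;

    *zeros = 0;
    *total = 0;
    while ((n = fread(buf, 1, CHUNK, fp)) > 0) {
        /* Only buf[0] .. buf[n-1] hold data from this call. */
        for (size_t i = 0; i < n; i++)
            if (buf[i] == '\0')
                (*zeros)++;
        *total += n;
    }
    /* fread returns 0 at end of file and on error; ferror tells them apart. */
    return ferror(fp) ? -1 : 0;
}
```

Because binary data can contain '\0' anywhere, printing such a buffer with "%s" would stop at the first zero byte; fwrite(buf, 1, n, stdout) is the usual way to emit exactly n bytes.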


On Dec 10, 6:55 am, Khookie <chris.k...@gmail.com> wrote:
[snip]

> #define BUFSIZE 32768
> #define CHAR_SIZE 1
>
> int main(int argc, char **argv) {
>     clock_t start = clock();
>
>     char buf[BUFSIZE + 1];

This is useless:
/*

>     memset(buf, '\0', BUFSIZE + 1);

*/

>
>     FILE *file = fopen("stuff.txt", "rb");

/* Create a 16K I/O buffer: */
setvbuf ( file , NULL , _IOFBF , 1024*16 );

/* We should check the return of setvbuf, as well as the file pointer
itself, of course. */
/* I will leave that to you. */

>     while (fread(buf, BUFSIZE, CHAR_SIZE, file)) {
>         //printf("%s", buf);
>         memset(buf, '\0', BUFSIZE);
>     }
>     //printf("%s", buf);
>
>     fclose(file);
>
>     clock_t finish = clock();
>     printf("Total CPU time: %d\n", finish - start);
> }

[snip]

> What does everyone think? Could it be better?

1. The memset() calls are totally pointless. You set the data buffer
to zero and then set it to the wanted value via reads. This is no
different from just setting the value via reads.
2. If you want to read faster, then enlarge the read buffer via
setvbuf(). It makes a bigger difference for writing than for
reading, but it should reduce the total number of reads.
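
The checks that the reply leaves as an exercise could look roughly like the sketch below. The helper name `open_buffered` is hypothetical and not from the thread. It uses only standard behaviour: fopen returns NULL on failure, and setvbuf returns nonzero if the request cannot be honoured and has to be called before any other operation on the stream.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` for binary reading and asks stdio for a
   fully buffered stream with a library-allocated buffer of `bufsize` bytes.
   Returns NULL if either step fails. */
FILE *open_buffered(const char *path, size_t bufsize)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;                      /* could not open the file */

    /* Must come before any read, write, or seek on fp. */
    if (setvbuf(fp, NULL, _IOFBF, bufsize) != 0) {
        fclose(fp);                       /* buffer request refused */
        return NULL;
    }
    return fp;
}
```

A call such as open_buffered("stuff.txt", 16 * 1024) would stand in for the fopen and setvbuf pair in the quoted code. Whether a particular size actually speeds up reading depends on the implementation and is worth measuring rather than assuming.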


On 10 Dec, 14:55, Khookie <chris.k...@gmail.com> wrote:

> Hi everyone
>
> BTW, thanks to everyone in this group, your collective advice has been
> very helpful. I have to say, the C guys are definitely much nicer
> than the Lisp guys ;-).
>
> Tonight, I was thinking about freads, and how to get them faster. I
> initially wrote a program like this:
>
> -- START CODE --
> #include <stdio.h>
> #include <time.h>
>
> #define BUFSIZE 32768
> #define CHAR_SIZE 1

BUFSIZ is defined in stdio.h. It is selected (in part)
as the size that is most efficient for I/O on the
implementation. Unless you have a really good
reason not to, you should probably use it instead
of randomly selecting your own value.
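
One way to compare BUFSIZ against a hand-picked size is to time both on the same stream. The sketch below is an illustration under my own assumptions, not something posted in the thread: the helper name `time_read` is hypothetical, and it reports CPU time as a double by dividing by CLOCKS_PER_SEC, since clock_t is not guaranteed to match the "%d" format used in the original program.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: rewinds fp, reads it to the end in requests of
   `chunk` bytes, stores the byte count in *total, and returns the CPU
   seconds used. Returns -1.0 on a bad argument, allocation failure,
   or read error. */
double time_read(FILE *fp, size_t chunk, size_t *total)
{
    char *buf;
    size_t n;
    clock_t start, finish;

    if (chunk == 0 || (buf = malloc(chunk)) == NULL)
        return -1.0;

    rewind(fp);                 /* start from the beginning each time */
    *total = 0;
    start = clock();
    while ((n = fread(buf, 1, chunk, fp)) > 0)
        *total += n;
    finish = clock();
    free(buf);

    if (ferror(fp))
        return -1.0;
    return (double)(finish - start) / CLOCKS_PER_SEC;
}
```

Calling time_read(file, BUFSIZ, &n) and time_read(file, 32768, &n) on the same open file gives two figures to compare, printed with "%f". Later runs may benefit from operating system caching, so a single pair of numbers should be read with caution.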

