How can I make my program more flexible?


Question




Hello. I wrote a program in C and have finished it, but I have the following problem.
I read a xxx.in file that contains 1,000,000 numbers and store them in a dynamic array using fscanf. The first number in the file is the count, so the file actually holds 1,000,001 numbers. I read that count first with fscanf(fp, "%d", &t_size); so that I know to create an array of size 1,000,000. Then I store the values with the code below:

int i = 0;
for (i = 0; i < t_size; i++)
{
	fscanf(fp, "%d", &my_array[i]);
}











My problem isn't the coding but the flexibility of my program. I want the whole procedure, from the start through storing the values, to take 1 second, and that doesn't happen with fscanf.

Can someone tell me whether it is possible to store that many numbers in a dynamic array within 1 second (using fread or fgets), and how I can do it?

Recommended answer




Try reading the entire file in one go; that should be fairly quick for a file of the size you're referring to.

Then, when you have everything in memory, it should be possible to process 1,000,000 rows in one second. Below is a naive implementation that first reads everything into a single char*, then splits that by line into an array of char*, and lastly converts each line to an int by calling atoi.
(Note that this code isn't tidied up, and probably needs to be.)

This program takes the file to read as a command-line parameter:

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <time.h>

int is_end(char* input) {
	return *input == 0;
}

int is_linebreak(char* input) {
	return *input == '\r' || *input == '\n' || *input == ' ';
}

char* eat_linebreaks(char* input) {
	while (is_linebreak(input))
		++input;

	return input;
}

size_t count_lines(char* input) {
	char* p = input;
	size_t rows = 1;

	if (is_end(p))
		return 0;
	
	while (!is_end(p)) {
		if (is_linebreak(p)) {
			++rows;
			p = eat_linebreaks(p);
		}
		else {
			++p;
		}
	}
	return rows;
}

/* split string by lines */
char** get_lines(char* input, size_t line_count) {
	char* p = input;
	char* from = input;
	size_t length = 0;
	size_t line = 0;
	size_t i; // same type as length, so the copy loops compare like with like
	char** lines = (char**)malloc(line_count * sizeof(char*));

	do {
		if (is_end(p) || is_linebreak(p)) {
			lines[line] = (char*)malloc(length + 1);
			for (i = 0; i < length; ++i)
				lines[line][i] = *(from + i);

			lines[line][length] = 0;
			length = 0;
			++line;
			p = eat_linebreaks(p);
			from = p;
			
		}
		else {
			++length;
			++p;
		}
	} while (!is_end(p));

	// Copy the last line as well in case the input doesn't end in line-break
	lines[line] = (char*)malloc(length + 1);
	for (i = 0; i < length; ++i)
		lines[line][i] = *(from + i);

	lines[line][length] = 0;
	++line;


	return lines;
}

int main(int argc, char* argv[]) {
	clock_t start;
	unsigned long microseconds;
	float seconds;
	char** lines;
	size_t size;
	size_t number_of_rows;
	int count;
	int* my_array;
	start = clock();

	FILE *stream;
	char *contents;
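	long fileSize = 0;
	int i;

	// Check the argument, open the file, find the size of it
	if (argc < 2) {
		fprintf(stderr, "Usage: pass the file to read as the first argument\n");
		return 1;
	}
	stream = fopen(argv[1], "rb");
	if (stream == NULL) {
		perror(argv[1]);
		return 1;
	}
	fseek(stream, 0L, SEEK_END);
	fileSize = ftell(stream);
	fseek(stream, 0L, SEEK_SET);
	if (fileSize < 0) {
		fclose(stream);
		return 1;
	}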

	// Allocate space for the entire file content
	contents = (char*)malloc(fileSize + 1);

	// Stream file into memory
	size = fread(contents, 1, fileSize, stream);
	contents[size] = 0; 
	fclose(stream);

	// Count rows in content
	number_of_rows = count_lines(contents);

	// Get array of char*, one for each line
	lines = get_lines(contents, number_of_rows);
	
	// Get the numbers out of the lines
	count = atoi(lines[0]); // First row has count
	my_array = (int*)malloc(count * sizeof(int));
	for (i = 0; i < count; ++i) {
		my_array[i] = atoi(lines[i + 1]);
	}

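	// clock() counts ticks of 1/CLOCKS_PER_SEC, which is not necessarily microseconds
	microseconds = (unsigned long)((clock() - start) * 1000000.0 / CLOCKS_PER_SEC);
	seconds = microseconds / 1000000.0f;
	printf("Took %fs\n", seconds);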


	return 0;
}














Hope this helps,
Fredrik
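
As a leaner variant of the read-everything-then-parse idea above, the per-line copies can be skipped by walking the in-memory buffer with strtol, which skips spaces, '\r' and '\n' on its own. This is only a sketch: the function name parse_ints and its error handling are illustrative assumptions, not part of the answer above, and it assumes the buffer is NUL-terminated (as contents is in the program above).

```c
#include <stdlib.h>

/* Hypothetical helper: parse "count v1 v2 ..." from a NUL-terminated buffer.
 * Returns a malloc'd array (caller frees) or NULL on error.
 * *out_count receives how many values were actually parsed. */
int *parse_ints(const char *text, size_t *out_count) {
	char *end;
	long count = strtol(text, &end, 10); /* first number is the count */
	int *values;
	size_t n = 0;

	*out_count = 0;
	if (end == text || count < 0)
		return NULL; /* no leading number, or a negative count */

	values = malloc((count > 0 ? (size_t)count : 1) * sizeof *values);
	if (values == NULL)
		return NULL;

	while (n < (size_t)count) {
		const char *p = end;
		long v = strtol(p, &end, 10); /* skips leading whitespace itself */
		if (end == p) /* no more digits: stop early */
			break;
		values[n++] = (int)v;
	}
	*out_count = n;
	return values;
}
```

With the program above, this would be called as parse_ints(contents, &n) right after the fread, replacing the count_lines, get_lines and atoi steps. Whether it meets the 1 second target depends on the machine and should be measured.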


The title of your question is probably misleading. You are not looking for a more flexible program, but for more speed.

fscanf is a relatively complex function and hence relatively slow. It is worth trying fgets followed by strtol instead. But even this might take longer than 1 second per million lines.
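
A minimal sketch of the fgets plus strtol approach, assuming one number per line with the count on the first line, as in the question. The function name read_ints_by_line and the 64-byte line buffer are illustrative assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read "count\nv1\nv2\n..." from fp, one number per line.
 * Returns a malloc'd array (caller frees) or NULL on error.
 * *out_count receives how many values were actually read. */
int *read_ints_by_line(FILE *fp, size_t *out_count) {
	char line[64]; /* assumed large enough for one number per line */
	long count;
	int *values;
	size_t n = 0;

	*out_count = 0;
	if (fgets(line, sizeof line, fp) == NULL)
		return NULL; /* empty file or read error */
	count = strtol(line, NULL, 10);
	if (count < 0)
		return NULL;

	values = malloc((count > 0 ? (size_t)count : 1) * sizeof *values);
	if (values == NULL)
		return NULL;

	/* Stop at the announced count or at end of file, whichever comes first */
	while (n < (size_t)count && fgets(line, sizeof line, fp) != NULL)
		values[n++] = (int)strtol(line, NULL, 10);

	*out_count = n;
	return values;
}
```

The caller would open the file with fopen, pass the FILE* in, and free the returned array when done.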

What would make your input processing much faster is to use a binary format instead of a text format (see the comment of Sergey). In a binary file you could store your numbers byte for byte in the same layout they have in your dynamic array, so you could load the whole array with a single call to fread. That would be orders of magnitude faster.
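
A minimal sketch of the binary idea, assuming a made-up file layout: a size_t count followed by the raw int values. The function names are hypothetical, and a file written this way is only readable on a machine with the same int size and byte order. On real files, open with "wb" and "rb".

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write the count followed by the raw int values.
 * Returns 0 on success, -1 on a short write. */
int write_ints_binary(FILE *fp, const int *values, size_t count) {
	if (fwrite(&count, sizeof count, 1, fp) != 1)
		return -1;
	if (fwrite(values, sizeof *values, count, fp) != count)
		return -1;
	return 0;
}

/* Hypothetical helper: read the array back with one fread for the count
 * and one fread for all the values. Caller frees the result. */
int *read_ints_binary(FILE *fp, size_t *out_count) {
	size_t count;
	int *values;

	*out_count = 0;
	if (fread(&count, sizeof count, 1, fp) != 1)
		return NULL;
	values = malloc((count ? count : 1) * sizeof *values);
	if (values == NULL)
		return NULL;
	if (fread(values, sizeof *values, count, fp) != count) {
		free(values); /* file shorter than its header claims */
		return NULL;
	}
	*out_count = count;
	return values;
}
```

The text file would be converted once with write_ints_binary, and later runs would load it with read_ints_binary.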

However, a binary file cannot be read and modified with a text editor. So, if your client or teacher requires that the input be a text file, that technique won't work.

