Looking for advice on how to deal with array of structs


Problem description





Hi,

I am reading a fairly large file a line at a time, doing some
processing, and filtering out bits of the line. I am storing the
interesting information in a struct and then printing it out. This
works without any problems. I now would like to filter "duplicate"
records. They aren't really duplicate, I can't just qsort as is and
eliminate the matching rows. I have a number of fields that will be the
same, but some of the fields will differ and the timestamp may be off
by a second or two. I want to eliminate records that have the fields
that match where the difference between timestamps is less than n
seconds. Again, this is not a problem, I can get seconds since epoch
and compare, or just use difftime. My problem is this, I want to build
an array of structs, that will hold lines that match, sort them on the
date member and then eliminate based on matching timestamps.

For example in the set:
foo,bar ...,a, Thu Jan 25 01:40:11 EST 2007
foo,bar ...,a, Thu Jan 25 01:45:35 EST 2007
foo,bar ...,a, Thu Jan 25 01:48:09 EST 2007
foo,bar ...,b, Thu Jan 25 01:40:12 EST 2007
foo,baz ..., Thu Jan 25 01:40:11 EST 2007

I would like to read the first 4 lines into structs, store them in
array, sort them and then print them out, while eliminating the
"duplicate" line 4. I would not want to read the 5th line, yet, because
it is dissimilar to the first 4 - foo, baz instead of foo,bar. I am
ignoring the field that has a or b in it (one of the reasons a simple
sort will not work)

My problem comes from not knowing where to create the array, what size
to allocate, and how to re-initialize it when I move to the next set to
sort.

I have included a mix of pseudo code and real code. I have also made
everything generic. My ultimate questions are:

* Is it a good idea to declare the struct instance outside the while
loop and then reinitialize it every time through the loop, or would it
be better to make it local? The real struct is 132 bytes.

* Is this the best way to (re)initialize a struct (init_cr)?

* How do I (re)initialize an array of struct?

* any comments on how I plan to tackle this

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct {
char field1[11];
char field1[11];
time_t date_secs;
} record;

int main(int argc, char *argv[]) {
FILE *fp;
fp = stdin;
record cr, lr;
int len = 512;
char buf[len+1];

record records[11];

fp = fopen(argv[1], "r");
if(fp == NULL) {
fputs("Could not open file for reading", stderr);
exit(1);
}

while(fgets(buf, len, fp)) {
init_cr(&cr);

/* split line into fields */
/* process fields and save in struct */

/* if the fields of interest match last fields */
/* save struct in next position in array */
/* increment counter for how many structs are saved */
/* resize the array to accommodate more structs if needed
*/
/* else */
/* print array of structs */
/* reset array ??? */
}
return 0;
}

void init_cr(callrecord *cr) {
cr->field1[0] = '\0';
cr->field2[0] = '\0';
cr->date_secs = 0;
}
Cliff

Recommended answer

Cliff Martin wrote:




Hi,

I am reading a fairly large file a line at a time, doing some
processing, and filtering out bits of the line. I am storing the
interesting information in a struct and then printing it out. This
works without any problems. I now would like to filter "duplicate"
records. They aren't really duplicate, I can't just qsort as is and
eliminate the matching rows. I have a number of fields that will be the
same, but some of the fields will differ and the timestamp may be off
by a second or two. I want to eliminate records that have the fields
that match where the difference between timestamps is less than n
seconds. Again, this is not a problem, I can get seconds since epoch
and compare, or just use difftime. My problem is this, I want to build
an array of structs, that will hold lines that match, sort them on the
date member and then eliminate based on matching timestamps.

For example in the set:
foo,bar ...,a, Thu Jan 25 01:40:11 EST 2007
foo,bar ...,a, Thu Jan 25 01:45:35 EST 2007
foo,bar ...,a, Thu Jan 25 01:48:09 EST 2007
foo,bar ...,b, Thu Jan 25 01:40:12 EST 2007
foo,baz ..., Thu Jan 25 01:40:11 EST 2007

I would like to read the first 4 lines into structs, store them in
array, sort them and then print them out, while eliminating the
"duplicate" line 4. I would not want to read the 5th line, yet, because
it is dissimilar to the first 4 - foo, baz instead of foo,bar. I am
ignoring the field that has a or b in it (one of the reasons a simple
sort will not work)

My problem comes from not knowing where to create the array, what size
to allocate, and how to re-initialize it when I move to the next set to
sort.


From your requirements it appears that a linked list will be a better
option than an array of structs. It's easy to keep it sorted as you add
elements to the list or remove them.


I have included a mix of pseudo code and real code. I have also made
everything generic. My ultimate questions are:

* Is it a good idea to declare the struct instance outside the while
loop and then reinitialize it every time through the loop, or would it
be better to make it local? The real struct is 132 bytes.




Depends on the desired lifetime for the structure instance.


* Is this the best way to (re)initialize a struct (init_cr)?

* How do I (re)initialize an array of struct?




Use a FOR loop.
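
A minimal sketch of what such a loop could look like, assuming the second member is meant to be field2 (as the init_cr function in the post implies). The names init_record and init_records are made up for this illustration and are not from the original code.

```c
#include <stddef.h>
#include <time.h>

/* Assumed layout: the post's struct with the second member named field2. */
typedef struct {
    char field1[11];
    char field2[11];
    time_t date_secs;
} record;

/* Reset one record to an "empty" state (hypothetical helper). */
static void init_record(record *r)
{
    r->field1[0] = '\0';
    r->field2[0] = '\0';
    r->date_secs = 0;
}

/* Reset the first n records of an array, one element at a time. */
static void init_records(record *recs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        init_record(&recs[i]);
}
```

With the array from the post this would be called as init_records(records, 11). If a separate counter tracks how many elements are in use, resetting that counter is usually enough and the loop can be skipped.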


* any comments on how I plan to tackle this

Code:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct {
char field1[11];
char field1[11];




Two objects in scope with the same identifier.


time_t date_secs;
} record;

int main(int argc, char *argv[]) {
FILE *fp;
fp = stdin;




This is not guaranteed to work on all C implementations.


record cr, lr;
int len = 512;
char buf[len+1];




Nor this.


record records[11];

fp = fopen(argv[1], "r");
if(fp == NULL) {
fputs("Could not open file for reading", stderr);
exit(1);




Use EXIT_FAILURE instead of 1, unless you have a good reason for using
the latter.


}

while(fgets(buf, len, fp)) {




fgets() will store only len-1 characters into the buffer so sizeof(buf)
would do.
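
As an illustration of that point, here is a small sketch that passes sizeof buf straight to fgets(). The function count_lines is invented for this example; it only counts newline-terminated lines, so that an over-long line split across several reads is counted once.

```c
#include <stdio.h>
#include <string.h>

/* Count newline-terminated lines in fp (hypothetical helper). */
static long count_lines(FILE *fp)
{
    char buf[512];
    long lines = 0;

    /* fgets() writes at most sizeof buf - 1 characters plus the '\0',
       so the full buffer size can be passed without a "+ 1". */
    while (fgets(buf, sizeof buf, fp) != NULL) {
        /* A chunk without '\n' is part of an over-long line or the
           final unterminated line; it is not counted here. */
        if (strchr(buf, '\n') != NULL)
            lines++;
    }
    return lines;
}
```

Using sizeof buf also keeps the size in one place: changing the array declaration is enough, and the call cannot drift out of sync with it.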


init_cr(&cr);

/* split line into fields */
/* process fields and save in struct */

/* if the fields of interest match last fields */
/* save struct in next position in array */
/* increment counter for how many structs are saved */
/* resize the array to accommodate more structs if needed
*/



Note that you cannot resize a statically allocated array, unless it's a
VLA. Allocate using malloc() and resize with realloc().
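
One possible shape for the malloc()/realloc() approach is sketched below. The record_list type, the record_list_push function and the starting capacity of 16 are assumptions made for this example, and field2 is assumed for the second struct member. The sketch relies on realloc(NULL, n) behaving like malloc(n), so a list that starts as {NULL, 0, 0} needs no separate first allocation. Note that the size of a variable length array is also fixed once it has been created, so dynamic allocation is the way to get an array that grows.

```c
#include <stdlib.h>
#include <time.h>

typedef struct {
    char field1[11];
    char field2[11];
    time_t date_secs;
} record;

/* Growable array of records (hypothetical type for this sketch). */
typedef struct {
    record *items;
    size_t count;     /* elements in use */
    size_t capacity;  /* elements allocated */
} record_list;

/* Append a copy of *r, doubling the capacity when the array is full.
   Returns 0 on success, -1 if allocation fails (list left unchanged). */
static int record_list_push(record_list *list, const record *r)
{
    if (list->count == list->capacity) {
        size_t new_cap = list->capacity ? list->capacity * 2 : 16;
        record *p = realloc(list->items, new_cap * sizeof *p);

        if (p == NULL)
            return -1;
        list->items = p;
        list->capacity = new_cap;
    }
    list->items[list->count++] = *r;
    return 0;
}
```

Assigning the result of realloc() to a temporary first means the old block is not lost if the call fails. The caller frees list.items once at the end.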


/* else */
/* print array of structs */
/* reset array ??? */





If the next iteration will rewrite to the array, then probably this is
not needed.
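
To make the "print, then start over" step concrete, here is a sketch of how one group of matching records might be sorted and thinned out before printing. The name dedup_group is invented, field2 is assumed for the second member, and comparing each record against the last record that was kept (rather than against its immediate predecessor) is a design assumption, not something the post specifies.

```c
#include <stdlib.h>
#include <time.h>

typedef struct {
    char field1[11];
    char field2[11];
    time_t date_secs;
} record;

/* qsort() comparator: ascending by timestamp. */
static int cmp_date(const void *a, const void *b)
{
    const record *ra = a;
    const record *rb = b;
    double d = difftime(ra->date_secs, rb->date_secs);

    return (d > 0) - (d < 0);
}

/* Sort recs[0..count-1] by time, then compact the array in place,
   dropping every record that is less than n seconds after the last
   record kept. Returns the number of records kept. */
static size_t dedup_group(record *recs, size_t count, double n)
{
    size_t i, kept = 0;

    qsort(recs, count, sizeof *recs, cmp_date);
    for (i = 0; i < count; i++) {
        if (kept == 0 ||
            difftime(recs[i].date_secs, recs[kept - 1].date_secs) >= n)
            recs[kept++] = recs[i];
    }
    return kept;
}
```

After printing the kept records, setting the in-use counter back to 0 is enough to begin the next group; the old elements are simply overwritten.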


}
return 0;
}

void init_cr(callrecord *cr) {
cr->field1[0] = '\0';
cr->field2[0] = '\0';
cr->date_secs = 0;
}




I still feel a linked list of structures may serve your purpose better.
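
For comparison, a linked list kept in timestamp order could look roughly like this. The node type and the functions list_insert_sorted and list_free are hypothetical names for this sketch, and field2 is assumed for the second struct member.

```c
#include <stdlib.h>
#include <time.h>

typedef struct {
    char field1[11];
    char field2[11];
    time_t date_secs;
} record;

/* Singly linked list node holding one record (hypothetical type). */
typedef struct node {
    record rec;
    struct node *next;
} node;

/* Insert a copy of *r so the list stays in ascending date_secs order.
   Returns 0 on success, -1 if allocation fails. */
static int list_insert_sorted(node **head, const record *r)
{
    node *n = malloc(sizeof *n);

    if (n == NULL)
        return -1;
    n->rec = *r;
    /* Walk past every node that is not later than the new record. */
    while (*head != NULL &&
           difftime((*head)->rec.date_secs, r->date_secs) <= 0)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
    return 0;
}

/* Free every node and leave the list empty. */
static void list_free(node **head)
{
    while (*head != NULL) {
        node *next = (*head)->next;

        free(*head);
        *head = next;
    }
}
```

The list needs no size up front and is already sorted when the group ends, at the cost of one allocation per record and a linear walk on each insert.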


typedef struct {
char field1[11];
char field1[11];

Two objects in scope with the same identifier.



Should be field2, but that is just the example. The real code has about 12
different identifiers, all of which are unique.


int main(int argc, char *argv[]) {
FILE *fp;
fp = stdin;




This is not guaranteed to work on all C implementations.



Assigning to a file pointer? So I should just open stdin like a regular
file if the fp is NULL? Is there a compiler option to force this usage?
I'm using gcc.


record cr, lr;
int len = 512;
char buf[len+1];




Nor this.



What's wrong with this?


exit(1);



Use EXIT_FAILURE instead of 1, unless you have a good reason for using
the latter.




ok.


while(fgets(buf, len, fp)) {




fgets() will store only len-1 characters into the buffer so sizeof(buf)
would do.



Did not know this. I use this style everywhere. I will correct it.


*/
Note that you cannot resize a statically allocated array, unless it's a
VLA. Allocate using malloc() and resize with realloc().




What is a VLA - Very Large Array?

Thanks,

Cliff
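
For reference, VLA stands for variable length array, a C99 feature: an array with automatic storage whose length is a run-time value rather than a constant expression. The declaration char buf[len+1] with len an int variable is one. The sketch below uses an invented function, sum_with_vla, only to show the syntax; a compiler in strict C90 mode (for example gcc with -ansi -pedantic) is expected to diagnose it.

```c
#include <stddef.h>

/* Sum 0..n-1 using scratch storage sized at run time.
   n must be greater than 0 (a zero-length VLA is undefined). */
static long sum_with_vla(size_t n)
{
    long scratch[n];   /* VLA: n is not a compile-time constant */
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        scratch[i] = (long)i;
    for (i = 0; i < n; i++)
        total += scratch[i];
    return total;
}
```

Replacing the variable with a constant, for instance #define LEN 512 and char buf[LEN + 1], turns the buffer into an ordinary fixed-size array that every C compiler accepts.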


OK, I am using the command line options -ansi and -pedantic. They tell
me some things I'm doing wrong.

I don't understand why this is wrong:

record cr, lr;

The compiler complains I'm mixing declarations and code.

The compiler does not catch the assigning stdin to fp, how can I
enforce that?

Cliff
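
A likely reading of that diagnostic: C90 requires all declarations in a block to come before the first statement, and in the posted code the statement fp = stdin; sits between FILE *fp; and record cr, lr;. The assignment itself is ordinary C, which would explain why the compiler does not flag it. The sketch below shows a C90-friendly ordering; open_input is a made-up helper, and falling back to stdin when no file name is given is an assumption about the intent.

```c
#include <stdio.h>

/* Return the file named by argv[1], or stdin when no name was given.
   Returns NULL if the named file cannot be opened. */
static FILE *open_input(int argc, char *argv[])
{
    FILE *fp;              /* C90: declarations come first ... */

    fp = stdin;            /* ... statements follow */
    if (argc > 1)
        fp = fopen(argv[1], "r");
    return fp;
}
```

In main, the same rule means declaring fp, cr, lr, buf and records at the top of the block and only then assigning or calling anything.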

