Reading Words from File


Problem description

I want to read in lines from a file and then separate the words so I
can do a process on each of the words. Say the text file "readme.txt"
contains the following:

In the face of criticism from the left and right, President Bush
insisted Tuesday that Harriet Miers is the nation's best-qualified
candidate for the Supreme Court and assured skeptical conservatives
that his lawyer...

I could get an input to a char *s such that s = "In" and then I do
something with s, then s = "the" and then I do something with that,
etc., with no idea of the length of any string or line or whitespace.

Here's what I have so far.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void process(char *s) /* what's here is not really important */
{
    printf("%s", s);
}

int main() {

    char buffer[80];
    FILE *f = fopen("readme.txt", "r");
    char *s;

    while( fgets(buffer, sizeof(buffer), f) != NULL ) /* reads a line */
    {
        while( sscanf(buffer, "%s", s) ) /* scans for words in line */
        {
            process(s); /* do stuff to the words */
        }
    }

    fclose(f);
    return 0;

}

Also, is there any way to adjust the size of the buffer or reallocate
the memory so it doesn't overflow and cause a segmentation fault?

Recommended answer

"dough" <vi****@gmail.com> wrote in message
news:11*********************@f14g2000cwb.googlegroups.com...
> I want to read in lines from a file and then separate the words so I
> can do a process on each of the words. Say the text file "readme.txt"
> contains the following:
>
> In the face of criticism from the left and right, President Bush
> insisted Tuesday that Harriet Miers is the nation's best-qualified
> candidate for the Supreme Court and assured skeptical conservatives
> that his lawyer...
>
> I could get an input to a char *s such that s = "In" and then I do
> something with s, then s = "the" and then I do something with that,
> etc., with no idea of the length of any string or line or whitespace.







I don't want to be harsh, but it seems to me the 2nd paragraph is off topic
and unwise for a poster looking for help...

Alex


In article <11*********************@f14g2000cwb.googlegroups.com>,
dough <vi****@gmail.com> wrote:
:I want to read in lines from a file and then separate the words so I
:can do a process on each of the words.

There is often a non-trivial semantic problem in deciding what
a "word" is in such matters. For example, in

"Oh!," he yelled (into his Hello-Kitty phone.)

then if you go by whitespace you get "words" such as

"Oh!," and (into and phone.) and Hello-Kitty

which is usually not the breakdown you want.
--
These .signatures are sold by volume, and not by weight.







dough wrote On 10/04/05 14:39,:
> I want to read in lines from a file and then separate the words so I
> can do a process on each of the words. Say the text file "readme.txt"
> contains the following:
>
> In the face of criticism from the left and right, President Bush
> insisted Tuesday that Harriet Miers is the nation's best-qualified
> candidate for the Supreme Court and assured skeptical conservatives
> that his lawyer...
>
> I could get an input to a char *s such that s = "In" and then I do
> something with s, then s = "the" and then I do something with that,
> etc., with no idea of the length of any string or line or whitespace.
>
> Here's what I have so far.
>
> #include <ctype.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
>
> void process(char *s) /* what's here is not really important */
> {
>     printf("%s", s);
> }
>
> int main() {
>
>     char buffer[80];
>     FILE *f = fopen("readme.txt", "r");
>     char *s;

It would be a good idea to test `f == NULL' before
proceeding ...

>     while( fgets(buffer, sizeof(buffer), f) != NULL ) /* reads a line */
>     {
>         while( sscanf(buffer, "%s", s) ) /* scans for words in line */

Here's a problem: `s' doesn't point to anything, so
when sscanf() locates a word and tries to copy it to the
memory `s' points at, all manner of mischief can ensue.

>         {
>             process(s); /* do stuff to the words */
>         }
>     }
>
>     fclose(f);
>     return 0;
>
> }
>
> Also, is there any way to adjust the size of the buffer or reallocate
> the memory so it doesn't overflow and cause a segmentation fault?






If you used malloc() to create the space for `buffer', you
could use realloc() to enlarge it. But the immediate problem
is not the size of `buffer', but the uninitialized `s'.

Your overall task sounds like a job for the much-maligned
strtok() function. However, see Walter Roberson's post for
some of the pitfalls of using simple string-bashing to separate
"words" from their surroundings.

--
Er*********@sun.com

