Counting words in a file like the Linux wc command in C
Question
I am trying to write something that works like the Linux command wc, counting words, newlines and bytes in any kind of file, and the only C function I can use for input is read. I have written the code below. It gives the correct values for newlines and bytes, but the word count is wrong.
int bytes = 0;
int words = 0;
int newLine = 0;
char buffer[1];
int file = open(myfile,O_RDONLY);
if(file == -1){
    printf("can not find :%s\n",myfile);
}
else{
    char last = 'c';
    while(read(file,buffer,1)==1){
        bytes++;
        if(buffer[0]==' ' && last!=' ' && last!='\n'){
            words++;
        }
        else if(buffer[0]=='\n'){
            newLine++;
            if(last!=' ' && last!='\n'){
                words++;
            }
        }
        last = buffer[0];
    }
    printf("%d %d %d %s\n",newLine,words,bytes,myfile);
}
Recommended answer
You should reverse your logic. Rather than looking for a space and incrementing the word count, look for a non-space character and increment the count there. That way each word is counted once, at its first character, so the last word is still counted when the file does not end with a space or newline. It also helps to track a state variable instead of looking at the last character:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    const char *myfile = "test.txt";
    int bytes = 0;
    int words = 0;
    int newLine = 0;
    char buffer[1];
    enum states { WHITESPACE, WORD };
    int state = WHITESPACE;
    int file = open(myfile, O_RDONLY);

    if (file == -1) {
        printf("can not find :%s\n", myfile);
    }
    else {
        while (read(file, buffer, 1) == 1)
        {
            bytes++;
            if (buffer[0] == ' ' || buffer[0] == '\t')
            {
                state = WHITESPACE;
            }
            else if (buffer[0] == '\n')
            {
                newLine++;
                state = WHITESPACE;
            }
            else
            {
                /* a word starts at the first non-whitespace character after whitespace */
                if (state == WHITESPACE)
                {
                    words++;
                }
                state = WORD;
            }
        }
        printf("%d %d %d %s\n", newLine, words, bytes, myfile);
        close(file);
    }
    return 0;
}
It appears that wc has some additional logic so that punctuation characters are not counted as words, which this code does not handle.
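One gap that is easy to close is whitespace other than space, tab and newline (for example carriage return or form feed). The following is a minimal sketch of the same state-machine idea, assuming that the standard isspace() classification is an acceptable definition of a separator. The names struct counts and count_chunk are illustrative and not part of the original answer, and the sketch makes no attempt at punctuation or locale-specific handling.

```c
#include <ctype.h>
#include <stddef.h>

/* Running totals. The struct and function names are hypothetical. */
struct counts {
    long lines;
    long words;
    long bytes;
    int in_word;   /* 1 while the scan is inside a word, 0 otherwise */
};

/*
 * Feed one chunk of data into the running totals. The state is kept in
 * the struct, so a word split across two chunks is counted only once.
 * Intended use (an assumption, mirroring the read loop above):
 *
 *     char buf[4096];
 *     ssize_t n;
 *     while ((n = read(file, buf, sizeof buf)) > 0)
 *         count_chunk(&c, buf, (size_t)n);
 */
static void count_chunk(struct counts *c, const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* isspace() requires a value representable as unsigned char */
        unsigned char ch = (unsigned char)buf[i];

        c->bytes++;
        if (ch == '\n')
            c->lines++;

        if (isspace(ch)) {
            c->in_word = 0;
        } else if (!c->in_word) {
            /* first non-whitespace byte after whitespace: a new word */
            c->in_word = 1;
            c->words++;
        }
    }
}
```

Initialize the struct to all zeros before the first call. Reading into a larger buffer also avoids one system call per byte, while still using only read for input.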