Parsing a comma separated file using C using fscanf()
Problem description
I have a file with data something like this -
Name, Age, Occupation
John, 14, Student
George, 14, Student
William, 23, Programmer
Now, I want to read the data such that each value (e.g. Name, Age etc.) is read as a string.
This is my code snippet -
....
if (!(ferror(input_fp) || ferror(output_fp))) {
    while (fscanf(input_fp, "%30[^ ,\n\t]%30[^ ,\n\t]%30[^ ,\n\t]",
                  name, age_array, occupation) != EOF) {
        fprintf(stdout, "%-30s%-30s%-30s\n", name, age_array, occupation);
    }
    fclose(input_fp);
    fclose(output_fp);
}
....
However, this goes into an infinite loop giving some random output.
This is how I understand my input conversion specifiers:
%30[^ ,\n\t]
-> read a string that is at most 30 characters long and that
DOES NOT include a space, a comma, a newline or a tab character.
And I am reading 3 such strings.
Where am I going wrong?
Recommended answer
OP's
fscanf(input_fp, "%30[^ ,\n\t]%30[^ ,\n\t]%30[^ ,\n\t]", ...
does not consume the ',' nor the '\n' in the text file. The first call stores only the name and stops at the first comma, so age_array and occupation are never assigned, which explains the random output. Subsequent fscanf() attempts fail at that same comma and return 0, which, not being EOF, causes an infinite loop.
Although OP requested a fscanf() solution, a fgets()/sscanf() approach better handles potential IO and parsing errors.
FILE *input_fp;   // assumed to be opened for reading elsewhere
FILE *output_fp;  // assumed to be opened for writing elsewhere
char buf[100];
while (fgets(buf, sizeof buf, input_fp) != NULL) {
    char name[30];  // Ensure this size is 1 more than the width in the scanf format.
    char age_array[30];
    char occupation[30];
    #define VFMT " %29[^ ,\n\t]"
    int n;  // Used to check for trailing junk
    if (3 == sscanf(buf, VFMT "," VFMT "," VFMT " %n", name, age_array,
                    occupation, &n) && buf[n] == '\0') {
        // Suspect OP really wants this width to be 1 more
        if (fprintf(output_fp, "%-30s%-30s%-30s\n", name, age_array, occupation) < 0)
            break;  // output error
    } else
        break;  // format error
}
fclose(input_fp);
fclose(output_fp);
Rather than calling ferror(), check the return values of fgets() and fprintf().
I suspect OP's undeclared field buffers were [30], so I adjusted the scanf() width to 29 accordingly.
Details about if (3 == sscanf(buf, VFMT, ...
The if (3 == sscanf(...) && buf[n] == '\0') { condition becomes true when:
1) each of the 3 "%29[^ ,\n\t]" format specifiers scans in at least 1 char.
2) buf[n] is the end of the string. n is set via the "%n" specifier. The preceding ' ' in " %n" causes any white-space following the last "%29[^ ,\n\t]" to be consumed. When sscanf() sees "%n", it stores the current offset from the beginning of the scan into the int pointed to by &n.
VFMT "," VFMT "," VFMT " %n"
is concatenated by the compiler to
" %29[^ ,\n\t], %29[^ ,\n\t], %29[^ ,\n\t] %n"
I find the former easier to maintain than the latter.
The first space in " %29[^ ,\n\t]" directs sscanf() to scan over (consume and not save) 0 or more white-space characters (' ', '\t', '\n', etc.). The rest directs sscanf() to consume and save 1 to 29 char, none of which is ' ', ',', '\n' or '\t', then append a '\0'.