Split String in C to recognize consecutive tabs
Question
I have a file in which certain fields are separated by tabs. There will always be 17 tabs, but their order can vary, such as:
75104\tDallas\t85\t34.46\t45.64
75205\tHouston\t\t37.34\t87.32
93434\t\t\t1.23\t3.32
When I use strtok in the following fashion:
while (fgets(buf, sizeof(buf), fp) != NULL) {
    tok = strtok(buf, "\t");
    while (tok != NULL) {
        printf("%s->", tok);
        tok = strtok(NULL, "\t");
    }
}
I get all the tokens, but double tabs (\t\t) or longer runs are ignored. However, I need to know when a field is empty. I cannot have strtok ignore multiple tabs, because the structure depends on 17 tabs being counted, with a placeholder used when a field is empty.
I tried using
if(tok == NULL || '')
but I don't think strtok recognizes a tab that follows another tab. What is the best way to deal with this issue?
Recommended answer
You can't use strtok in your case. From man strtok:
The strtok() function breaks a string into a sequence of zero or more nonempty tokens ... From the above description, it follows that a sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter, and that delimiter bytes at the start or end of the string are ignored. Put another way: the tokens returned by strtok() are always nonempty strings. Thus, for example, given the string "aaa;;bbb,", successive calls to strtok() that specify the delimiter string ";," would return the strings "aaa" and "bbb", and then a null pointer.
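To make the quoted behavior concrete, here is a minimal sketch that counts how many tokens strtok reports for a string. The helper name count_strtok_tokens is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count the tokens strtok() finds in `s`.
 * `s` must be writable, because strtok() overwrites delimiters with '\0'.
 */
size_t count_strtok_tokens(char *s, const char *delims)
{
    size_t n = 0;

    assert(s != NULL && delims != NULL);

    /* Runs of delimiters are collapsed, so empty fields are never counted. */
    for (char *tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;

    return n;
}
```

With the man page's own example, "aaa;;bbb," and the delimiters ";,", this reports two tokens. For the second sample line from the question, which has one empty field out of five, it reports four, which is exactly the lost-field problem described above.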
So you will have to find an alternative: either manually write a function that uses linear search and strncpy, use sscanf, or use strsep if it is available. The latter would very likely be my choice, because it was intended as a replacement for strtok.
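The hand-written option could look roughly like the sketch below. It is only one possible shape: the name split_keep_empty and its parameters are hypothetical, and instead of copying each field out with strncpy it overwrites each delimiter in place, which avoids needing a second buffer. It uses only standard C functions (strchr, strcspn).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split `line` in place on every single `delim`,
 * keeping empty fields. Stores at most `max_fields` pointers in `fields`
 * and returns how many were stored. Fields beyond `max_fields` are dropped.
 */
size_t split_keep_empty(char *line, char delim, char **fields, size_t max_fields)
{
    size_t count = 0;
    char *start = line;

    assert(line != NULL && fields != NULL && delim != '\0');

    /* Cut off the newline that fgets() leaves at the end of the line. */
    line[strcspn(line, "\r\n")] = '\0';

    while (count < max_fields) {
        char *end = strchr(start, delim);  /* linear search for the next delimiter */

        fields[count++] = start;           /* may point at an empty string */
        if (end == NULL)
            break;                         /* that was the last field */
        *end = '\0';                       /* terminate the current field */
        start = end + 1;                   /* next field starts after the delimiter */
    }
    return count;
}
```

Since 17 tabs delimit 18 fields, a call such as split_keep_empty(buf, '\t', fields, 18) on a line read with fgets would be expected to return 18, and an empty field shows up as a pointer to an empty string, where a placeholder can be substituted.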
From man strsep:
The strsep() function was introduced as a replacement for strtok(3), since the latter cannot handle empty fields. However, strtok(3) conforms to C89/C99 and hence is more portable.
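A minimal sketch of the strsep approach follows, assuming a platform that provides it (glibc or a BSD-derived libc). The helper name split_with_strsep is hypothetical, and the _DEFAULT_SOURCE define is a glibc feature-test macro that is only needed when compiling in a strict standard mode.

```c
#define _DEFAULT_SOURCE  /* assumption: glibc; exposes strsep() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split `line` in place on tabs with strsep(),
 * keeping empty fields. Stores at most `max_fields` pointers in `fields`
 * and returns how many were stored.
 */
size_t split_with_strsep(char *line, char **fields, size_t max_fields)
{
    size_t count = 0;
    char *rest = line;   /* strsep() advances this pointer past each field */
    char *tok;

    assert(line != NULL && fields != NULL);

    /* Cut off the newline that fgets() leaves at the end of the line. */
    line[strcspn(line, "\r\n")] = '\0';

    /* Unlike strtok(), strsep() returns "" for two adjacent delimiters. */
    while (count < max_fields && (tok = strsep(&rest, "\t")) != NULL)
        fields[count++] = tok;

    return count;
}
```

In the question's loop this would replace the strtok calls: after splitting, each entry can be checked with fields[i][0] == '\0' to decide whether to print the field or a placeholder.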