String parsing question

Views: 76 Posted: 2019/6/5 7:42:36

Question

I'm wondering about the best way to do the following:

I have a string delimited by semicolons. The items delimited may be in any of
the following formats:
1) 14 alphanum characters
2) 5 alphanums space 8 alphanums
3) 6 alphanums colon 8 alphanums
4) 5 alphanums colon 8 alphanums

My task is to convert items in the third format to the first format, and items
in the fourth format to the second. Also, I need to count the number of items
in the string, which may or may not have a trailing semicolon.

My plan (which I feel is sub-optimal - hence this post), is to step through
the initial string one character at a time to accomplish these things in one
pass. While I could count semicolons easily with strchr(), deleting the
colons properly means stepping through the whole string anyway (right?) and so
I may as well count semicolons simultaneously. I'd also like to validate the
data format (i.e., 15-character items are not allowed).

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int myfunc( const char *idlist )
{
    int items=0;
    char *cp=strdup( idlist ); /* nonstandard */
    char *newstr=cp;
    int shifts=0;
    int chars=0;

    if( !cp ) {
        return( -1 ); /* out of memory */
    }
    for( ; *cp ; cp++ ) {
        if( *cp == ':' ) {
            if( chars == 6 ) {
                shifts++; /* drop the colon: later characters move left */
                continue;
            }
            if( chars == 5 ) {
                *(cp-shifts)=' '; /* colon becomes a space */
                chars++;
                continue;
            }
            free( newstr );
            return( -1 ); /* error */
        }
        if( *cp == ';' ) {
            items++;
            if( chars != 14 ) {
                free( newstr );
                return( -1 ); /* error */
            }
            chars=0;
        }
        else if( ++chars > 14 ) {
            free( newstr );
            return( -1 ); /* error */
        }
        *(cp-shifts)=*cp;
    }
    *(cp-shifts)='\0';
    if( chars == 14 ) {
        items++;
    }
    if( !items || (chars && chars != 14) ) {
        free( newstr );
        return( -1 ); /* error */
    }
    printf( "The string '%s' has %d items.", newstr, items );
    free( newstr );
    return( 0 ); /* success */
}

Is there a better way?

--
Christopher Benson-Manica | Upon the wheel thy fate doth turn,
ataru(at)cyberspace.org | upon the rack thy lesson learn.

Recommended answers

In <bm**********@chessie.cirr.com> Christopher Benson-Manica <at***@nospam.cyberspace.org> writes:

Is there a better way?

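1. Such code is a maintenance nightmare (imagine that you have to make some changes to it 5 years from now).

2. I may be missing something, but I can't find any attempt to check that your characters really are alphanums; you're merely looking for your separators.

I'd implement this using sscanf calls. The result would be slower, but more readable. The following macro could be used in the alphanumeric conversion specifiers:

#define ALNUM "[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]"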
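
A minimal sketch of how the sscanf approach might look for a single item (semicolon already stripped), using the ALNUM macro as a scanset. The function name convert_item, the 15-byte output buffer and the return convention are assumptions made for illustration; they are not from the post above.

```c
#include <stdio.h>
#include <string.h>

#define ALNUM "[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]"

/* Hypothetical helper: converts one item into out (14 characters plus NUL).
   Returns 0 on success, -1 if the item matches none of the four formats. */
int convert_item(const char *item, char out[15])
{
    char head[15], tail[9], sep;
    int n = -1;

    /* Format 1: exactly 14 alphanumerics, copied unchanged. */
    if (sscanf(item, "%14" ALNUM "%n", head, &n) == 1 &&
        n == 14 && item[n] == '\0') {
        strcpy(out, head);
        return 0;
    }

    /* Formats 2-4: up to 6 alphanumerics, one separator, then 8 more. */
    n = -1;
    if (sscanf(item, "%6" ALNUM "%c%8" ALNUM "%n", head, &sep, tail, &n) != 3 ||
        n < 0 || item[n] != '\0' || strlen(tail) != 8)
        return -1;

    if (strlen(head) == 5 && (sep == ' ' || sep == ':'))
        sprintf(out, "%s %s", head, tail);   /* formats 2 and 4 */
    else if (strlen(head) == 6 && sep == ':')
        sprintf(out, "%s%s", head, tail);    /* format 3: colon dropped */
    else
        return -1;
    return 0;
}
```

The %n directive records how many characters were consumed, so trailing garbage after a well-formed prefix is rejected. Splitting the list on semicolons and counting items would still be done by the caller.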

Dan
--
Dan Pop
DESY Zeuthen, RZ Group
Email: Da*****@ifh.de

Christopher Benson-Manica wrote:

Is there a better way?

Another method would be to parse the string like a language. Analyze the
data to find its current format, then apply the conversion.

Let's look closer at the formats. Let A represent any character
from the set of alphanumerics.
[1] AAAAAAAAAAAAAA
[2] AAAAA AAAAAAAA
[3] AAAAAA:AAAAAAAA
[4] AAAAA:AAAAAAAA
Looking at the above lines, the formats differ at the 6th
column (starting with column 1 as the first column).
The variations are:
6th char   Format Number
--------   -------------
':'        4
' '        2
A          1 or 3
This last value requires looking at column 7:
7th char   Format Number
--------   -------------
':'        3
A          1

Based on this analysis, format selection looks easy.
Format conversion is left for the reader & OP.
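
The column analysis above can be sketched in C roughly as follows. The function name classify_item and its return convention (0 for "no format matches") are hypothetical, and isalnum is used for the alphanumeric test, which assumes the "C" locale.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns the format number (1-4) of one item
   (without its semicolon), or 0 if the item matches none of the formats. */
int classify_item(const char *item)
{
    size_t len = strlen(item);
    size_t sep;      /* index of the separator, or len if there is none */
    size_t i;
    int format;

    if (len < 7)
        return 0;    /* too short to inspect columns 6 and 7 */

    /* Column 6 is item[5]; column 7 is item[6]. */
    if (item[5] == ':')      { format = 4; sep = 5; }
    else if (item[5] == ' ') { format = 2; sep = 5; }
    else if (item[6] == ':') { format = 3; sep = 6; }
    else                     { format = 1; sep = len; }

    /* Formats 1, 2 and 4 are 14 characters long; format 3 is 15. */
    if (len != (format == 3 ? 15u : 14u))
        return 0;

    /* Everything except the separator must be alphanumeric. */
    for (i = 0; i < len; i++)
        if (i != sep && !isalnum((unsigned char)item[i]))
            return 0;

    return format;
}
```

Once the format number is known, conversion reduces to deleting the colon (format 3) or overwriting it with a space (format 4).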

Format1 ::= AlphaNum AlphaNum {...} AlphaNum

Format2 ::= AlphaNum AlphaNum AlphaNum AlphaNum
            AlphaNum ' '

Etc. You could try using a lexer or parser generator, such as
Lex and Yacc (Flex and Bison).

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book

Have you looked at strspn and strcspn? The latter will locate the (next)
semi-colon, and the former can verify that the characters from the current
to the semi-colon are all alphanumerics.

#include <string.h>

const char *alnum = "abcdefghijklmnopqrstuvwxyz"
                    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                    "0123456789";

/* Returns the length of the token that ends at the next semi-colon,
   or 0 if there is no semi-colon or the token is not all alphanumeric. */
size_t tokenLength( const char *tkn )
{
    size_t len, semi;

    if ( !tkn )
        return (size_t)0;

    len = strlen( tkn );
    semi = strcspn( tkn, ";" );
    if ( semi == len ) // There's no semi-colon
        return (size_t)0;

    if ( strspn( tkn, alnum ) != semi )
        return (size_t)0; // Not all alpha-num

    return semi;
}
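
As a rough sketch of how strcspn and strspn could drive the whole list, the loop below counts items and tolerates a missing trailing semicolon. The name count_items and the widened character set (alphanumerics plus space and colon) are assumptions for illustration. It checks only the characters of each item, not its length or layout.

```c
#include <string.h>

/* Alphanumerics plus the two separators allowed inside an item (assumption). */
static const char *item_chars =
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789 :";

/* Hypothetical helper: returns the number of items in list, or -1 if an
   item is empty or contains a character outside item_chars. */
int count_items(const char *list)
{
    int items = 0;

    while (*list) {
        size_t len = strcspn(list, ";");   /* length up to ';' or end of string */

        if (len == 0 || strspn(list, item_chars) < len)
            return -1;
        items++;
        list += len;
        if (*list == ';')
            list++;                        /* skip the delimiter */
    }
    return items;
}
```

Because strcspn stops at either a semicolon or the terminating NUL, the last item is counted whether or not the list ends with a semicolon.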

--
#include <standard.disclaimer>
_
Kevin D Quitt USA 91387-4454 96.37% of all statistics are made up
Per the FCA, this address may not be added to any commercial mail list
