sed to replace // with /* */ comments EXCEPT when // comments appear within /* */
Problem description
The problem I am facing is with an ANSI compiler that requires C style comments.
So I am trying to convert my existing comments to comply with the C standard ISO C89.
I am looking for a sed expression to replace // comments with /* */ comments EXCEPT when // comments appear within /* */ comments (which would break the comment).
I have tried this (a range expression) to no avail:
    sed -e '/\/*/,/*\//! s_//\(.*\)_/*\1 */_' > filename
Will something work to ignore the 1 line comments inside a comment like this but change everything else?
    /**********************************
     * Some comment
     * an example bit of code within the comment followed by a //comment
     * some more comment
     ***********************************/
    y = x+7;  //this comment must be changed
Thanks!
Solution
Here's a lightly tested filter written in C that should perform the conversion you want. Some comments about what this filter does that are difficult if not impossible to handle with a regex:
- it ignores comment-like sequences that are enclosed in quotes (since they aren't comments)
- if a C99 comment that is being converted contains something that would start or end a C89 comment, it munges that sequence so there will be no nested comment or premature end to the comment (a nested /* gets changed to /+ and a nested */ gets changed to *|). I wasn't sure if you needed this or not (if you don't, it should be easy to remove)
- the above munging of nested comments only occurs in a C99 comment that's being converted; the contents of comments that are already C89 style are not changed.
- it does not handle trigraphs or digraphs (I think this only allows the possibility of missing an escape sequence or end of line continuation that is initiated with the trigraph ??/).
Of course, you'll need to perform your own testing to determine if it's suitable for your purposes.
    #include <stdio.h>

    char* a = " this is /* a test of \" junk // embedded in a '\' string";
    char* b = "it should be left alone//";

    // comment /* that should ***//// be converted.

    /* leave this alone*/// but fix this one

    // and "leave these \' \" quotes in a comment alone*
    /**** and these '\' too //
    */

    enum states {
        state_normal,
        state_double_quote,
        state_single_quote,
        state_c89_comment,
        state_c99_comment
    };

    enum states current_state = state_normal;

    void handle_char( char ch)
    {
        static char last_ch = 0;

        switch (current_state) {
            case state_normal:
                if ((last_ch == '/') && (ch == '/')) {
                    putchar( '*');  /* NOTE: changing to C89 style comment */
                    current_state = state_c99_comment;
                }
                else if ((last_ch == '/') && (ch == '*')) {
                    putchar( ch);
                    ch = 0; /* 'forget' the star so a slash right after it can't close the comment */
                    current_state = state_c89_comment;
                }
                else if (ch == '\"') {
                    putchar( ch);
                    current_state = state_double_quote;
                }
                else if (ch == '\'') {
                    putchar( ch);
                    current_state = state_single_quote;
                }
                else {
                    putchar( ch);
                }
                break;

            case state_double_quote:
                if ((last_ch == '\\') && (ch == '\\')) {
                    /* we want to output this \\ escaped sequence, but we */
                    /* don't want to 'remember' the current backslash -   */
                    /* otherwise we'll mistakenly treat the next character */
                    /* as being escaped                                    */
                    putchar( ch);
                    ch = 0;
                }
                else if ((ch == '\"') && (last_ch != '\\')) {
                    putchar( ch);
                    current_state = state_normal;
                }
                else {
                    putchar( ch);
                }
                break;

            case state_single_quote:
                if ((last_ch == '\\') && (ch == '\\')) {
                    /* we want to output this \\ escaped sequence, but we */
                    /* don't want to 'remember' the current backslash -   */
                    /* otherwise we'll mistakenly treat the next character */
                    /* as being escaped                                    */
                    putchar( ch);
                    ch = 0;
                }
                else if ((ch == '\'') && (last_ch != '\\')) {
                    putchar( ch);
                    current_state = state_normal;
                }
                else {
                    putchar( ch);
                }
                break;

            case state_c89_comment:
                if ((last_ch == '*') && (ch == '/')) {
                    putchar( ch);
                    ch = 0; /* 'forget' the slash so it doesn't affect a possible slash that immediately follows */
                    current_state = state_normal;
                }
                else {
                    putchar( ch);
                }
                break;

            case state_c99_comment:
                if ((last_ch == '/') && (ch == '*')) {
                    /* we want to change any slash-star sequences inside */
                    /* what was a C99 comment to something else to avoid */
                    /* nested comments                                   */
                    putchar( '+');
                }
                else if ((last_ch == '*') && (ch == '/')) {
                    /* similarly for star-slash sequences inside */
                    /* what was a C99 comment                    */
                    putchar( '|');
                }
                else if (ch == '\n') {
                    puts( "*/");
                    current_state = state_normal;
                }
                else {
                    putchar( ch);
                }
                break;
        }

        last_ch = ch;
    }

    int main(void)
    {
        int c;

        while ((c = getchar()) != EOF) {
            handle_char( c);
        }

        return 0;
    }
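Since the filter is only lightly tested, it may help to have a form of it that can be checked against expected strings. The block below is a minimal sketch of the same state machine ported to a string-to-string function. The name `convert`, the short state names, and the buffer-size contract are illustrative assumptions that do not appear in the filter itself.

```c
enum state { NORMAL, DQUOTE, SQUOTE, C89, C99 };

/*
 * Hypothetical string-to-string port of the filter's state machine.
 * `out` must hold strlen(in) + 1 bytes, plus 2 extra bytes for every
 * double-slash comment in `in` (each one gains a closing star-slash).
 */
void convert(const char *in, char *out)
{
    enum state st = NORMAL;
    char last = 0;

    for (; *in != '\0'; ++in) {
        char ch = *in;    /* what the next iteration sees as `last` */
        char emit = ch;   /* what is actually written to `out` */

        switch (st) {
        case NORMAL:
            if (last == '/' && ch == '/') {
                emit = '*';        /* second slash becomes a star */
                st = C99;
            } else if (last == '/' && ch == '*') {
                st = C89;
                ch = 0;            /* a slash right after the opener must not close it */
            } else if (ch == '"') {
                st = DQUOTE;
            } else if (ch == '\'') {
                st = SQUOTE;
            }
            break;
        case DQUOTE:
        case SQUOTE:
            if (last == '\\' && ch == '\\') {
                ch = 0;            /* escaped backslash must not escape the next char */
            } else if (last != '\\' && ch == (st == DQUOTE ? '"' : '\'')) {
                st = NORMAL;
            }
            break;
        case C89:
            if (last == '*' && ch == '/') {
                st = NORMAL;
                ch = 0;            /* forget the slash that closed the comment */
            }
            break;
        case C99:
            if (last == '/' && ch == '*') {
                emit = '+';        /* munge a nested opener */
            } else if (last == '*' && ch == '/') {
                emit = '|';        /* munge a nested closer */
            } else if (ch == '\n') {
                *out++ = '*';      /* close the converted comment before the newline */
                *out++ = '/';
                st = NORMAL;
            }
            break;
        }
        *out++ = emit;
        last = ch;
    }
    *out = '\0';
}
```

Given the line `y = x+7; //this comment must be changed` followed by a newline, `convert` should produce `y = x+7; /*this comment must be changed*/` followed by a newline, while a `//` inside a `/* */` comment or inside a string literal should come back unchanged. Like the filter, this sketch only closes a converted comment when it sees a newline, so input that ends in the middle of a `//` comment is left unclosed.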
Some indulgent commentary: many years ago, a shop I worked at wanted to impose a coding standard that forbade C99-style comments on the grounds that even though the compiler we used at the time had no problem with them, the code might have to be ported to a compiler that didn't support them. I (and others) successfully argued that that possibility was so remote as to be essentially nonexistent, and that even if it did happen, a conversion routine to make the comments compatible could be easily written. We were permitted to use C99/C++ style comments.
I now consider my oath fulfilled, and whatever curse that may have been laid on me to be lifted.