Using sed to remove bodies of functions on a C/C++ file


Question

I am trying to create a file with all function/enum/struct/etc. names from a source file. To do that, I am currently trying to use sed to accomplish something like this:

(original file)

function add1 (int i) {
    return i+1;
}

(sed output)

function add1 (int i) {
}

In other words, I want to remove the actual contents of the function's body. So far I have not been able to get it to work. Any suggestions?

EDIT: I tried something like this, with no success (for now I am only trying to blank out the lines of the function's body):

sed '/{/,/}/ s/.*//'

Answer

Instead of sed, you could use awk in per-character field mode (FS=""):

awk 'BEGIN {
         RS = "\n" ;
         FS = "" ;
         d = 0 ;
     }

     {
         for (i=1; i<=NF; i++)
             if ($i == "{") {
                 d++ ;
                 if (d == 1) printf "{\n"
             } else
             if ($i == "}") {
                 d-- ;
                 if (d == 0) printf "}"
             } else
             if (d == 0)
                 printf "%s", $i ;
         if (d == 0) printf "\n"
     }' INPUT-FILE(s)...

The above skips the contents of any paired curly braces (function and structure bodies, array initializations, and so on) and writes the result to standard output. You can specify one or more files. If you don't specify any files, it reads from standard input.
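To try this on the sample from the question, the script can be wrapped in a small shell function. This is only a sketch: the function name strip_bodies is made up for illustration, and it relies on FS = "" splitting each record into single characters, which gawk and mawk do but POSIX does not specify.

```shell
# strip_bodies: hypothetical wrapper name, not part of the original answer.
# Reads the named files (or standard input) and drops brace-enclosed bodies.
strip_bodies() {
    awk 'BEGIN { RS = "\n"; FS = ""; d = 0 }
    {
        for (i = 1; i <= NF; i++) {
            if ($i == "{") {
                d++
                if (d == 1) printf "{\n"    # keep only the outermost "{"
            } else if ($i == "}") {
                d--
                if (d == 0) printf "}"      # keep only the outermost "}"
            } else if (d == 0) {
                printf "%s", $i             # copy text outside any braces
            }
        }
        if (d == 0) printf "\n"
    }' "$@"
}

# Usage: strip_bodies file.c    or    some_command | strip_bodies
```

Fed the three-line add1 function from the question, this prints the signature line ending in "{" followed by a line containing only "}".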

As written, it gets confused by braces inside quotes or comments. That could be fixed in the same way, but it gets complicated fast. This is just a hack to get you most of the way.

I added the semicolons (;) so that you can put everything in the above snippet on one long command line.

The logic of the script is very simple. It uses an empty field separator (FS), so that every character in the input becomes its own field. The BEGIN rule runs once before any input is processed, and sets this up. For the benefit of readers, I also initialize d = 0 explicitly, although awk does not require it: uninitialized variables are treated as empty or zero as appropriate. d tracks the current brace depth as each input character is processed.

The second braced block is executed once per record. Since I set RS = "\n", each line is a separate record, so the block runs once per input line. Because FS = "", each character on that line is a separate field. There are NF fields in the record: $1, $2, .., $(NF-1), and $NF. The three-part if clause simply outputs the outermost braces, and everything not within braces (i.e. when d == 0).
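To make the depth tracking concrete, here is a small trace helper. It is an illustrative sketch, not part of the original script: the name trace_depth is hypothetical, and like the script it assumes an awk (such as gawk or mawk) that splits records into single characters when FS = "". For each input line it prints NF and every character paired with the value of d after that character has been processed.

```shell
# trace_depth: hypothetical helper for illustration only.
trace_depth() {
    awk 'BEGIN { FS = ""; d = 0 }
    {
        line = ""; sep = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "{") d++          # entering a brace level
            else if ($i == "}") d--     # leaving a brace level
            line = line sep $i "=" d    # record "char=depth"
            sep = " "
        }
        print NF " fields: " line
    }' "$@"
}

# Usage: printf 'a{b{c}}d\n' | trace_depth
```

For the input a{b{c}}d this prints "8 fields: a=0 {=1 b=1 {=2 c=2 }=1 }=0 d=0". The script keeps exactly the characters with depth 0, plus the opening brace that raises the depth to 1.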

It is possible to extend this awk scriptlet to handle comments, strings, and character constants (use \047 to refer to a single quote, unless you put the script into a separate file with #!/usr/bin/awk -f), and to process or ignore preprocessor macros.
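A minimal sketch of such an extension follows. It is not the answer's code: the function name strip_bodies_lex, the emit helper, and the state variable are all invented here, and the scan uses substr() instead of FS = "" so that it does not depend on per-character field splitting. It only stops braces inside string literals, character constants, and comments from being counted; line continuations, C++ raw strings, and preprocessor directives are deliberately left unhandled.

```shell
# strip_bodies_lex: hypothetical name; a sketch under the assumptions above.
strip_bodies_lex() {
    awk '
    # Print s only when we are outside every pair of braces.
    function emit(s) { if (d == 0) printf "%s", s }

    BEGIN { d = 0; state = "code" }   # state: code, str, chr, line, block

    {
        if (state == "line") state = "code"   # a // comment ends at the newline
        n = length($0)
        for (i = 1; i <= n; i++) {
            c   = substr($0, i, 1)    # current character
            two = substr($0, i, 2)    # current character plus the next one
            if (state == "block") {
                if (two == "*/") { emit(two); i++; state = "code" }
                else emit(c)
            } else if (state == "line") {
                emit(c)
            } else if (state == "str" || state == "chr") {
                if (c == "\\") { emit(two); i++ }     # skip the escaped character
                else {
                    emit(c)
                    if (state == "str" && c == "\"")   state = "code"
                    if (state == "chr" && c == "\047") state = "code"
                }
            }
            else if (two == "/*") { emit(two); i++; state = "block" }
            else if (two == "//") { emit(two); i++; state = "line" }
            else if (c == "\"")   { emit(c); state = "str" }
            else if (c == "\047") { emit(c); state = "chr" }
            else if (c == "{")    { d++; if (d == 1) printf "{\n" }
            else if (c == "}")    { d--; if (d == 0) printf "}" }
            else emit(c)
        }
        if (d == 0) printf "\n"
    }' "$@"
}

# Usage: strip_bodies_lex file.c    or    some_command | strip_bodies_lex
```

Comments and literals that sit outside any braces are copied through unchanged; the extra states only affect which braces are counted.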

It does get a bit complicated, and you'll end up with a couple of hundred lines of awk script, but it should be quite reliable and reasonably fast. It is possible only because the C tokenization rules that matter in this particular case are easy to follow; in all other use cases I would personally use a full-blown C lexer (lexical analyzer or scanner). And probably for this one, too.

If you want to use a full-blown C lexer, a number of them are freely available on the net, but you'll have to write the tool in a language such as C or C++. If you wish to handle all the corner cases, it will need to incorporate a C/C++ preprocessor, too, but those rules are easy (even with awk).
