Using sed to transform a C struct and typedef

Question

I have a couple of structure definitions in my input code. For example:

struct node {
   int val;
   struct node *next;
};

typedef struct {
   int numer;
   int denom;
} Rational;

I used the following line to collapse each definition onto a single line and print it three times (the two "p" commands plus sed's automatic print).

sed '/struct[^(){]*{/{:l N;s/\n//;/}[^}]*;/!t l;s/  */ /g;p;p}'

The result is:

struct node { int val; struct node *next;};
struct node { int val; struct node *next;};
struct node { int val; struct node *next;};

typedef struct { int numer; int denom;} Rational;
typedef struct { int numer; int denom;} Rational;
typedef struct { int numer; int denom;} Rational;

This is what I want:

  1. I would like the first line to be restored to the original structure block.
  2. I would like the second line to turn into a function heading that looks like this:

void init_structName( structName *var, int data1, int data2 ) 

-structName is the name of the structure.

-var is any name you like.

-data1, data2, ... are the members of the struct.

  3. I would like the third line to turn into the function body, where I initialize the struct members from the parameters. It would look like this:

    {
        var->data1 = data1;
        var->data2 = data2;
    }

Keep in mind that ALL struct definitions in my input file are placed on one line and copied three times. So when the code finds a structure definition, it can assume that two more copies follow directly below.

For example, this is the output I want if the input file had the repeated lines shown above:

struct node {
   int val;
   struct node *next;
};
void init_node(struct node *var, int val, struct node *next)
{
   var->val = val;
   var->next = next;
}

typedef struct {
   int numer;
   int denom;
} Rational;
void init_Rational( Rational *var, int numer, int denom ) 
{
   var->numer = numer;
   var->denom = denom;
}

In case anyone is curious: these functions will be called from the main function to initialize the struct variables.

Can someone help? I realize this is kind of tough. Thanks so much!

Recommended answer

Since sed is Turing complete, it is possible to do this in a single pass, but that doesn't mean the code is very user friendly =)

My attempt at a solution would be:

#!/bin/sed -nf

/struct/b continue
p
d

: continue

# 1st step:
s/\(struct\s.*{\)\([^}]*\)\(}.*\)/\1\
\2\
\3/
s/;\(\s*[^\n}]\)/;\
\1/g
p

s/.*//
n
# 2nd step:
s/struct\s*\([A-Za-z_][A-Za-z_0-9]*\)\s*{\([^}]*\)}.*/void init_\1(struct \1 *var, \2)/
s/typedef\s*struct\s*{\([^}]*\)}\s*\([A-Za-z_][A-Za-z_0-9]*\)\s*;/void init_\2(\2 *var, \1)/
s/;/,/g
s/,\s*)/)/
p

s/.*//
n
# 3rd step
s/.*{\s*\([^}]*\)}.*/{\
\1}/
s/[A-Za-z \t]*[\* \t]\s*\([A-Za-z_][A-Za-z_0-9]*\)\s*;/\tvar->\1 = \1;\
/g
p

I'll try to explain everything I did, but first I must warn that this probably isn't very general. For example, it assumes that the three identical lines follow each other directly (i.e. with no other line between them).

Before starting, notice that the file is a script that requires the "-n" flag to run. This tells sed not to print anything to standard output unless the script explicitly tells it to (through the "p" command, for example). The "-f" option is a "trick" to tell sed to read its commands from the file that follows. When the script is executed as "./myscript.sed", the shebang line causes the system to run "/bin/sed -nf ./myscript.sed", so sed correctly reads the rest of the script.
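To make the effect of "-n" concrete, here is a minimal sketch that runs the same "p" command with and without the flag. The two-line input and the variable names are made up for this illustration.

```shell
# Without -n, sed auto-prints every line, so a line matched by "p" appears twice.
out_default=$(printf 'int a;\nstruct s;\n' | sed '/struct/p')

# With -n, only what "p" prints explicitly reaches standard output.
out_quiet=$(printf 'int a;\nstruct s;\n' | sed -n '/struct/p')

printf 'default:\n%s\nquiet:\n%s\n' "$out_default" "$out_quiet"
```

In the default run the struct line is printed twice and the other line once; in the quiet run only the struct line appears.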

Step zero is just a check to see whether we have a valid line. I'm assuming every valid line contains the word struct. If the line is valid, the script branches to the continue label (the "b" command is a jump, equivalent to the goto statement in C; unlike in C, labels start with ":" rather than ending with it). If the line isn't valid, we force it to be printed with the "p" command, and then delete it from the pattern space with the "d" command. After the deletion, sed reads the next line and starts executing the script from the beginning.
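The branch-or-pass-through logic of step zero can be tried in isolation. In this sketch the "MATCHED: " prefix is only a marker added for the demonstration, and the input lines are invented; the label is written without a space after the colon, which is the most portable spelling.

```shell
# Lines without "struct" are printed unchanged and the cycle restarts via "d".
# Lines with "struct" jump to the label and are tagged so the branch is visible.
step0=$(printf 'int x;\nstruct node { int val;};\n' | sed -n '
/struct/b continue
p
d
:continue
s/^/MATCHED: /
p
')

printf '%s\n' "$step0"
```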

If the line is valid, the actions that change the lines begin. The first step is to regenerate the struct body. This is done by a series of commands:

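  1. Split the line into three parts: everything up to and including the opening brace, everything between the braces (not including the closing brace), and everything from the closing brace onward (now including it). I should mention a quirk of sed: we search for a newline with "\n", but we write a newline as "\" followed by an actual newline. That is why this command is spread over three lines. IIRC this behaviour is what POSIX sed requires; the GNU version (present in most Linux distributions) also accepts "\n" in the replacement.
  2. Add a newline after every semicolon that is followed by more member text. This works a bit awkwardly: we capture whatever follows the semicolon and write it back after the inserted newline. The "g" flag tells sed to do this repeatedly, which is why it works. Again, note the escaped newline.
  3. Force the result to be printed.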
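The three commands above can be run on their own against one of the collapsed lines. This is a sketch rather than the exact script: it uses the POSIX classes "[[:space:]]" in place of "\s" and "[^}[:space:]]" in place of "[^\n}]" so that it does not depend on GNU extensions, and the variable names are made up.

```shell
one_line='struct node { int val; struct node *next;};'

# First s: split into header, members and tail using escaped newlines.
# Second s: break after each semicolon that is followed by another member.
block=$(printf '%s\n' "$one_line" | sed -n '
s/\(struct[[:space:]].*{\)\([^}]*\)\(}.*\)/\1\
\2\
\3/
s/;\([[:space:]]*[^}[:space:]]\)/;\
\1/g
p
')

printf '%s\n' "$block"
```

Each member ends up on its own line with the single leading space it had in the collapsed input; restoring the original three-space indentation would need one more substitution.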

Before the second step, we manually clear the pattern space (i.e. the buffer), so we can start fresh with the next line. If we did this with the "d" command, sed would start executing the commands from the top of the script again. The "n" command then reads the next line into the pattern space. After that, we run the commands that transform the line into a function declaration:
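A small sketch of how "n" lets one pass through the script handle consecutive lines differently. The input and the "A:"/"B:" markers are invented for the illustration.

```shell
# The cycle starts on "first"; "n" then loads "second" into the pattern space
# without restarting the script, so the second s/p pair sees the next line.
pairs=$(printf 'first\nsecond\n' | sed -n '
s/^/A:/
p
n
s/^/B:/
p
')

printf '%s\n' "$pairs"
```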

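  1. We first match the word struct, followed by zero or more whitespace characters, followed by a C identifier, which starts with an underscore or a letter and may contain underscores and alphanumeric characters. The identifier is captured in "\1". We then match the contents between the braces, which are stored in "\2". These are then used to generate the function declaration.
  2. We then repeat the same process, but now for the "typedef" case. Note that the identifier now comes after the braces, so "\1" now holds the contents of the braces and "\2" holds the identifier.
  3. Now we replace all semicolons with commas, so that the line starts to look more like a function definition.
  4. The last substitute command removes the extra comma before the closing parenthesis.
  5. Finally, print the result.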
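The heading step can be sketched as a small shell function. The function name to_heading is hypothetical, POSIX "[[:space:]]" stands in for "\s", and two details are choices made for this sketch: the replacement writes "*var,\2" without an extra space because the captured member list already starts with one, and the typedef case emits "Rational *var" rather than "struct Rational *var" to match the output requested in the question.

```shell
# Hypothetical helper: turn one collapsed struct/typedef line into a heading.
to_heading() {
  sed -n '
s/struct[[:space:]]*\([A-Za-z_][A-Za-z_0-9]*\)[[:space:]]*{\([^}]*\)}.*/void init_\1(struct \1 *var,\2)/
s/typedef[[:space:]]*struct[[:space:]]*{\([^}]*\)}[[:space:]]*\([A-Za-z_][A-Za-z_0-9]*\)[[:space:]]*;/void init_\2(\2 *var,\1)/
s/;/,/g
s/,[[:space:]]*)/)/
p
'
}

heading_node=$(printf '%s\n' 'struct node { int val; struct node *next;};' | to_heading)
heading_rational=$(printf '%s\n' 'typedef struct { int numer; int denom;} Rational;' | to_heading)

printf '%s\n%s\n' "$heading_node" "$heading_rational"
```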

Again, before the last step, we manually clear the pattern space and read the next line. This step then generates the function body:

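  1. Match and capture everything inside the braces. Notice the ".*" before the opening brace and after the closing brace. It is there so that only the contents of the braces are written back. When writing the output, we put the opening brace on a separate line.
  2. We match alphabetic characters and whitespace, so that we can skip the type declaration. We require at least one whitespace character or an asterisk (for pointers) to mark the start of the identifier. Then we capture the identifier. This only works because of what follows the capture: we explicitly require that only optional whitespace and then a semicolon come after the identifier. This forces the expression to take the identifier characters right before the semicolon, i.e. if there are two or more words, it only gets the last one. It therefore works with "unsigned int var", correctly capturing "var". When writing the output, we put some indentation first, followed by the desired format, including the escaped newline.
  3. Print the final output.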
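The body step can be sketched the same way. The function name to_body is hypothetical. The sketch uses "[[:blank:]]" (space and tab only) where the script uses "[A-Za-z \t]", so that the match cannot swallow the newline placed after the opening brace, and it indents with four spaces instead of "\t" to avoid relying on GNU escape handling.

```shell
# Hypothetical helper: turn one collapsed struct/typedef line into a function body.
to_body() {
  sed -n '
s/.*{[[:space:]]*\([^}]*\)}.*/{\
\1}/
s/[A-Za-z[:blank:]]*[*[:blank:]][[:blank:]]*\([A-Za-z_][A-Za-z_0-9]*\)[[:blank:]]*;/    var->\1 = \1;\
/g
p
'
}

body_node=$(printf '%s\n' 'struct node { int val; struct node *next;};' | to_body)
body_rational=$(printf '%s\n' 'typedef struct { int numer; int denom;} Rational;' | to_body)

printf '%s\n%s\n' "$body_node" "$body_rational"
```

Only the last word before each semicolon is kept as the member name, which is why "struct node *next" yields "next".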

I don't know if I was clear enough. Feel free to ask for any clarification.

Hope this helps =)
