How to extract a comment out of a header file using Python, Perl, or sed?

Problem description

I have a header file like this:

/*
 * APP 180-2 ALG-254/258/772 implementation
 * Last update: 03/01/2006
 * Issue date:  08/22/2004
 *
 * Copyright (C) 2006 Somebody's Name here
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. Neither the name of the project nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#ifndef HEADER_H
#define HEADER_H

/* More comments and C++ code here. */

#endif /* End of file. */

I want to extract the contents of the first C-style comment only, dropping the " *" at the start of each line, to get a file with the following contents:

 APP 180-2 ALG-254/258/772 implementation
 Last update: 03/01/2006
 Issue date:  08/22/2004

 Copyright (C) 2006 Somebody's Name here
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. Neither the name of the project nor the names of its contributors
    may be used to endorse or promote products derived from this software
    without specific prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 ARE DISCLAIMED.  IN NO EVENT SHALL THE PROJECT OR CONTRIBUTORS BE LIABLE
 FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGE.

Please suggest an easy way to do this with Python, Perl, sed, or some other way on Unix. Preferably as a one-liner.

Recommended answer

This should work for you:

sed -n '/\*\//q; /^\/\*/d; s/^ \* \?//p' <file.h >comment.txt

Here's an explanation: sed (as you may know) is a command that goes through a file applying a list of rules to each line. Each rule consists of a "selector" and commands that are applied to that line only if the selector matches.

The first rule has the selector /\*\//. This is a regular expression selector; it matches any line that contains the characters */. Both of these need to be backslash-escaped since they have special meanings in a regexp. (I've assumed that this will only match the closing line of the comment in your case and that this entire line should be deleted.) The command is q which means "quit." sed just stops. Ordinarily it would print out the line, but I provided the -n option which means "don't print unless explicitly instructed to."

The second rule has the selector /^\/\*/ which is again a regexp selector that matches the characters /* at the start of the line. Again, I've assumed this line will not contain part of the comment. The d command tells sed to delete this line and move on.

The final rule has no selector, so it applies to all lines (unless a previous command prevented processing from reaching the final rule). The command in this last rule is a substitution command, s/PATTERN/REPLACEMENT/, which finds text in the line that matches some pattern and replaces it with a replacement text. The pattern here is ^ \* \?, which matches a space, an asterisk, and either 0 or 1 spaces, but only at the beginning of the line. And the replacement is nothing. So sed simply deletes the leading space-asterisk-(space)? sequence. The p is actually a flag to the substitution command that tells sed to print out the result of the substitution. It's needed because of the -n option.
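
Since the question also asks about Python, the three sed rules can be mirrored roughly as follows. This is only a sketch and not part of the original answer: the function name extract_first_comment is hypothetical, and it relies on the same assumptions as the sed command (the opening /* and the closing */ each sit on a line of their own).

```python
import re


def extract_first_comment(lines):
    """Mirror the three sed rules on an iterable of text lines.

    Hypothetical helper; assumes the opening "/*" and the closing "*/"
    are each on their own line, as the sed command does.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        # Rule 1 (/\*\//q): stop at the first line containing "*/".
        if "*/" in line:
            break
        # Rule 2 (/^\/\*/d): skip a line that starts with "/*".
        if line.startswith("/*"):
            continue
        # Rule 3 (s/^ \* \?//p): strip a leading " *" plus an optional
        # space, and keep the line only if the substitution happened
        # (this is what the p flag does under -n).
        stripped, count = re.subn(r"^ \* ?", "", line)
        if count:
            out.append(stripped)
    return out
```

A file object can be passed directly, for example extract_first_comment(open("file.h")), and the returned list joined with newlines before being written to comment.txt. Like the sed command, this removes the space after the asterisk as well, so the output lines start at the text itself.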
