Remove C and C++ comments using Python?


Question

I'm looking for Python code that removes C and C++ comments from a string. (Assume the string contains an entire C source file.)

I realize that I could .match() substrings with a regex, but that doesn't handle a nested /*, or a // inside a /* */.

Ideally, I would prefer a non-naive implementation that properly handles awkward cases.
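To make the awkward cases concrete, here is a small sketch of what a naive regex does to a string literal that happens to contain //. The function name naive_strip and the sample line are made up for illustration; this is the failure mode, not a recommended approach.

```python
import re

def naive_strip(text):
    # Naive approach: delete /* ... */ and // ... with no awareness
    # of string literals or character constants.
    text = re.sub(r'/\*.*?\*/', '', text, flags=re.DOTALL)
    return re.sub(r'//[^\n]*', '', text)

sample = 'const char *url = "http://example.com"; /* note */\n'

# The block comment is removed correctly, but the "//" inside the
# string literal is also treated as a comment, so the URL is cut off.
print(repr(naive_strip(sample)))
```

The printed result ends right after "http:", which shows why comment delimiters inside literals need to be recognized and skipped.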

Recommended answer

I don't know if you're familiar with sed, the UNIX-based (but Windows-available) text parsing program, but I've found a sed script here which will remove C/C++ comments from a file. It's very smart; for example, it will ignore '//' and '/*' if found in a string declaration, etc. From within Python, it can be run with the following code:

import subprocess

# source_code is a string with the source code.
# '-f' makes sed read its commands from the script file; add '-n' as well
# if the script expects it (check the script's first line).
# capture_output and text require Python 3.7 or later.
process = subprocess.run(
    ['sed', '-f', '/path/to/remccoms3.sed'],
    input=source_code, capture_output=True, text=True)
return_code = process.returncode

stripped_code = process.stdout

In this program, source_code is the variable holding the C/C++ source code, and stripped_code ends up holding the C/C++ code with the comments removed. Of course, if you have the file on disk, you could instead pass file handles pointing to those files as the stdin and stdout arguments (the input file opened in read mode, the output file in write mode). remccoms3.sed is the file from the above link, and it should be saved in a readable location on disk. sed is also available on Windows, and comes installed by default on most GNU/Linux distros and Mac OS X.
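The file-on-disk variant mentioned above could look roughly like the sketch below. The helper name filter_file and the file names in the usage comment are hypothetical; the command list is passed in so that the same helper works with whatever sed invocation suits your script.

```python
import subprocess

def filter_file(command, in_path, out_path):
    """Run `command` with in_path as its stdin and out_path as its stdout."""
    with open(in_path, 'r') as src, open(out_path, 'w') as dst:
        # Real file handles can be handed to the child process directly.
        completed = subprocess.run(command, stdin=src, stdout=dst)
    return completed.returncode

# Hypothetical usage; the file names are placeholders:
# filter_file(['sed', '-f', '/path/to/remccoms3.sed'], 'input.c', 'stripped.c')
```

A non-zero return value indicates that the command failed, in which case the output file should not be trusted.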

This will probably be better than a pure Python solution; no need to reinvent the wheel.
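For cases where calling out to sed is not an option, a pure Python fallback is possible. The sketch below is not the sed script's method; it is a commonly used regex technique that matches string and character literals alongside comments and replaces only the comment matches. It assumes ordinary C source and does not handle backslash-newline continuations inside // comments, trigraphs, or C++ raw string literals. The name remove_comments is illustrative.

```python
import re

# Alternatives, in order: // comment, /* */ comment,
# character literal, string literal.
_PATTERN = re.compile(
    r'//.*?$|/\*.*?\*/|\'(?:\\.|[^\\\'])*\'|"(?:\\.|[^\\"])*"',
    re.DOTALL | re.MULTILINE,
)

def remove_comments(text):
    def replacer(match):
        s = match.group(0)
        # Comments start with '/'; literals start with a quote.
        # A comment becomes a single space so that adjacent tokens
        # do not get glued together; literals are kept unchanged.
        return ' ' if s.startswith('/') else s
    return _PATTERN.sub(replacer, text)

code = 'int a; // c1\nchar *s = "/* not */ // no"; /* c2\n more */ int b;\n'
print(remove_comments(code))
```

Because literals are consumed as whole matches, comment delimiters inside them are never seen as the start of a comment.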
