python regex to remove comments


Question

How would I write a regex that removes every comment that starts with # and runs to the end of the line, while leaving the first two lines untouched? Those two lines are:

#!/usr/bin/python 

#-*- coding: utf-8 -*-

Recommended answer

You can remove comments by parsing the Python code with tokenize.generate_tokens instead of using a regex. The following is a slightly modified version of an example from the docs:

import tokenize
import io
import sys
if sys.version_info[0] == 3:
    StringIO = io.StringIO
else:
    StringIO = io.BytesIO

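def nocomment(s):
    """Return the source string s with all comment tokens removed."""
    result = []
    g = tokenize.generate_tokens(StringIO(s).readline)
    for toknum, tokval, _, _, _ in g:
        # Keep every token except comments. Strings and docstrings are
        # STRING tokens, so a '#' inside them is left alone.
        if toknum != tokenize.COMMENT:
            result.append((toknum, tokval))
    # untokenize on (type, string) pairs regenerates valid code, but the
    # original spacing is not preserved.
    return tokenize.untokenize(result)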

with open('script.py','r') as f:
    content=f.read()

print(nocomment(content))

For example, if script.py contains

def foo(): # Remove this comment
    ''' But do not remove this #1 docstring 
    '''
    # Another comment
    pass

then the output of nocomment is

def foo ():
    ''' But do not remove this #1 docstring 
    '''

    pass 
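Note that nocomment treats the #! line and the coding line as ordinary comments, so it removes them too. The question asks for those two lines to stay. The following is a minimal sketch of one way to do that, not part of the original answer. It still finds comments with tokenize, but instead of calling untokenize it cuts each comment out of the original text at its start column, which also leaves the original spacing intact. The function name strip_comments_keep_header and the keep_lines parameter are made up for this illustration, and the sketch assumes Python 3.

```python
import io
import tokenize


def strip_comments_keep_header(source, keep_lines=2):
    """Remove comments, but leave the first `keep_lines` lines untouched.

    `keep_lines` is a hypothetical parameter introduced for this sketch.
    """
    # Split with the same reader the tokenizer uses, so row numbers agree.
    lines = io.StringIO(source).readlines()
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        row, col = tok.start  # row is 1-based, col is 0-based
        if tok.type != tokenize.COMMENT or row <= keep_lines:
            continue
        line = lines[row - 1]
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        # A comment always runs to the end of its line, so cutting at its
        # start column removes it; rstrip() drops the spaces before it.
        lines[row - 1] = body[:col].rstrip() + ending
    return "".join(lines)


if __name__ == "__main__":
    with open("script.py", "r") as f:
        print(strip_comments_keep_header(f.read()))
```

A line that held only a comment becomes an empty line, so line numbers in the result still match the original file. Passing keep_lines=0 strips every comment, including the two header lines.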
