Java - regular expression finding comments in code


Question

A little fun with Java this time. I want to write a program that reads source code from standard input (line by line, for example), like:

// some comment
class Main {
    /* blah */
    // /* foo
    foo();
    // foo */
    foo2();
    /* // foo2 */
}

finds all the comments in it, and removes them. I'm trying to use regular expressions, and so far I have something like this:

private static String ParseCode(String pCode)
{
    String MyCommentsRegex = "(?://.*)|(/\\*(?:.|[\\n\\r])*?\\*/)";
    return pCode.replaceAll(MyCommentsRegex, " ");
}

but it does not seem to work in all cases, e.g.:

System.out.print("We can use /* comments */ inside a string of course, but it shouldn't start a comment");

Any advice, or ideas other than regex? Thanks in advance.

Recommended answer

You may have already given up on this by now, but I was intrigued by the problem.

I believe this is a partial solution...

Native regex:

//.*|("(?:\\[^"]|\\"|.)*?")|(?s)/\*.*?\*/

In Java:

String clean = original.replaceAll( "//.*|(\"(?:\\\\[^\"]|\\\\\"|.)*?\")|(?s)/\\*.*?\\*/", "$1 " );

This appears to properly handle comment markers embedded in strings, as well as properly escaped quotes inside strings. I threw a few things at it to check, but not exhaustively.

There is one compromise: every string literal ("..." block) in the code ends up with a space after it. Keeping this simple while also solving that problem would be very difficult, given the need to cleanly handle:

int/* some comment */foo = 5;
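
To make the trade-off concrete, here is a small check of the one-line replaceAll. The class and method names (TrailingSpaceDemo, strip) are made up for illustration; only the pattern and the "$1 " replacement come from the answer.

```java
class TrailingSpaceDemo {
    // The one-liner from the answer: group 1 captures a string literal,
    // and the replacement "$1 " puts it back followed by a space.
    static String strip(String original) {
        return original.replaceAll(
            "//.*|(\"(?:\\\\[^\"]|\\\\\"|.)*?\")|(?s)/\\*.*?\\*/", "$1 ");
    }

    public static void main(String[] args) {
        // The block comment becomes a single space, so the tokens stay separated.
        System.out.println(strip("int/* some comment */foo = 5;")); // int foo = 5;
        // The string literal is kept, but picks up a trailing space.
        System.out.println(strip("print(\"hi\");"));                // print("hi" );
    }
}
```

The same "$1 " replacement serves both cases, which is why string literals gain the extra space.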

A simple Matcher.find/appendReplacement loop could check group(1) before replacing with a space, and would only be a handful of lines of code. Still simpler than a full-blown parser, maybe. (I could add the matcher loop too if anyone is interested.)
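
For reference, such a loop might look like the sketch below. It is not from the original answer: the class name CommentStripper and the method name stripComments are hypothetical, and it assumes the same pattern as the one-liner. When group(1) matched, the string literal is written back unchanged; otherwise the match was a comment and becomes a single space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentStripper {
    // Line comment | string literal (captured as group 1) | block comment.
    private static final Pattern COMMENT_OR_STRING = Pattern.compile(
        "//.*|(\"(?:\\\\[^\"]|\\\\\"|.)*?\")|(?s)/\\*.*?\\*/");

    static String stripComments(String source) {
        Matcher m = COMMENT_OR_STRING.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // group(1) is non-null only when a string literal matched.
            String replacement = (m.group(1) != null) ? m.group(1) : " ";
            // quoteReplacement keeps '$' and '\' inside literals from being
            // interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripComments("int/* some comment */foo = 5;")); // int foo = 5;
        System.out.println(stripComments("print(\"a /* b */ c\");"));
    }
}
```

Like the one-liner, this is still only a partial solution: character literals such as '"' are not handled, for example.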
