Regex Match PHP Comment


Problem description

I've been trying to match PHP comments using regex.

//([^<]+)\r\n

That's what I've got, but it doesn't really work.

I've also tried

//([^<]+)\r
//([^<]+)\n
//([^<]+)

...to no avail

Solution

To match comments, keep in mind that there are two main types of comments in PHP 5:

  • comments that start with // and run to the end of the line (PHP treats # the same way)
  • comments that start with /* and run to the next */

Assuming you first load the source file with these two lines:

$filePath = '/home/squale/developpement/astralblog/website/library/HTMLPurifier.php';
$str = file_get_contents($filePath);

You could match the first kind with:

$matches_slashslash = array();
if (preg_match_all('#//(.*)$#m', $str, $matches_slashslash)) {
    var_dump($matches_slashslash[1]);
}

And the second kind with:

$matches_slashstar = array();
if (preg_match_all('#/\*(.*?)\*/#sm', $str, $matches_slashstar)) {
    var_dump($matches_slashstar[1]);
}
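To see what the two patterns capture, here is a minimal sketch that runs them on a small inline sample instead of a file. The sample source and the variable names $lineComments and $blockComments are made up for illustration; the patterns are the ones shown above.

```php
<?php
// Hypothetical sample input; any string of PHP source would do.
$str = <<<'SRC'
<?php
// first line comment
$a = 1; // trailing comment
/* block
   comment */
$b = 2;
SRC;

// "//" up to the end of each line (m makes $ match at every line end).
$lineComments = array();
preg_match_all('#//(.*)$#m', $str, $lineComments);

// "/*" up to the nearest "*/" (s lets the dot match newlines too).
$blockComments = array();
preg_match_all('#/\*(.*?)\*/#sm', $str, $blockComments);

var_dump($lineComments[1]);
var_dump($blockComments[1]);
```

Index 1 of each result holds the text captured by the parentheses, that is, the comment body without its delimiters; index 0 holds the full match.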

But you will probably run into trouble with '//' in the middle of a string (and what about heredoc syntax, by the way, did you think about that one?), or with "toggle comments" like this:

/*
echo 'a';
/*/
echo 'b';
//*/

(Just add a slash at the beginning to "toggle" the two blocks, if you don't know the trick)

So... it is quite hard to detect comments reliably with regex alone...
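The string problem is easy to reproduce. The following sketch uses a made-up sample in which the only '//' sits inside a string literal; the line-comment pattern from above still reports a match.

```php
<?php
// Hypothetical sample: this code contains no comment at all.
$str = <<<'SRC'
<?php
$url = 'http://example.com/path';
echo $url;
SRC;

$matches = array();
$count = preg_match_all('#//(.*)$#m', $str, $matches);

// The pattern cannot tell that the "//" belongs to a string literal,
// so it reports the rest of that line as a comment.
var_dump($count);
var_dump($matches[1]);
```

A regex sees only characters, not whether they sit inside a string, a heredoc, or real code, which is why this kind of false positive is hard to rule out.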


Another way would be to use the PHP Tokenizer, which, obviously, knows how to parse PHP code and comments.

For reference, see the PHP manual pages for the Tokenizer extension, token_get_all(), and the list of parser tokens.

With that, you would have to use the tokenizer on your string of PHP code, iterate on all the tokens you get as a result, and detect which ones are comments.

Something like this would probably do:

$tokens = token_get_all($str);

foreach ($tokens as $token) {
    if ($token[0] == T_COMMENT
        || $token[0] == T_DOC_COMMENT) {
        // This is a comment ;-)
        var_dump($token);
    }
}

As output, you'll get a list of entries like this:

array
  0 => int 366
  1 => string '/** Version of HTML Purifier */' (length=31)
  2 => int 57

or this:

array
  0 => int 365
  1 => string '// :TODO: make the config merge in, instead of replace
' (length=55)
  2 => int 117

(You might still want to strip the // and /* */ delimiters, but that's up to you; at least you have extracted the comments ^^ )
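If you do want the bare text, stripping the delimiters could look roughly like this. The function name stripCommentDelimiters is hypothetical, and the sketch assumes its input is the text of a single comment token (element 1 of the arrays shown above).

```php
<?php
/**
 * Hypothetical helper: remove the comment delimiters from the text
 * of one comment token and trim the surrounding whitespace.
 */
function stripCommentDelimiters($text)
{
    $text = trim($text);

    // One-line comments: drop the leading "//" or "#".
    if (strncmp($text, '//', 2) === 0) {
        return trim(substr($text, 2));
    }
    if (strncmp($text, '#', 1) === 0) {
        return trim(substr($text, 1));
    }

    // Block and doc comments: drop "/*" (or "/**") and the closing "*/".
    if (strlen($text) >= 4
        && strncmp($text, '/*', 2) === 0
        && substr($text, -2) === '*/') {
        $inner = substr($text, 2, -2);
        return trim(ltrim($inner, '*'));
    }

    // Anything else is returned unchanged.
    return $text;
}

var_dump(stripCommentDelimiters("/** Version of HTML Purifier */"));
var_dump(stripCommentDelimiters("// :TODO: make the config merge in, instead of replace\n"));
```

This sketch leaves the leading " * " that multi-line doc comments usually carry on each inner line; removing those would need an extra per-line pass.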

If you really want to detect comments without any kind of strange error due to "strange" syntax, I suppose this would be the way to go ;-)
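As a self-contained sketch of that approach, the loop above can be wrapped in a function and run on a sample that contains both problem cases, a '//' inside a string and the toggle-comment trick. The function name extractComments and the sample source are assumptions for illustration. token_get_all() returns single-character tokens such as ";" as plain strings, so the sketch skips anything that is not an array, and it uses token_name() because the numeric token IDs differ between PHP versions.

```php
<?php
/**
 * Hypothetical helper: list every comment in a string of PHP source.
 * Each entry is array(token name, comment text, line number).
 */
function extractComments($source)
{
    $comments = array();
    foreach (token_get_all($source) as $token) {
        // Single-character tokens such as ";" or "{" are plain strings.
        if (!is_array($token)) {
            continue;
        }
        if ($token[0] === T_COMMENT || $token[0] === T_DOC_COMMENT) {
            $comments[] = array(token_name($token[0]), $token[1], $token[2]);
        }
    }
    return $comments;
}

// Sample input with "//" inside a string and the toggle-comment trick.
$source = <<<'SRC'
<?php
$url = 'http://example.com'; // real comment
/*
echo 'a';
/*/
echo 'b';
//*/
SRC;

foreach (extractComments($source) as $comment) {
    printf("%s on line %d: %s\n", $comment[0], $comment[2], trim($comment[1]));
}
```

The '//' inside the URL string is not reported, because the tokenizer hands the whole string literal back as a single string token.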
