How to remove trailing comments via regexp?

Question

For readers not familiar with MATLAB: I'm not sure which regex family they belong to, but MATLAB's regular expressions are described in full detail in the MATLAB documentation. MATLAB's comment character is % (percent) and its string delimiter is ' (apostrophe). A string delimiter inside a string is written as a double apostrophe ('this is how you write "it''s" in a string.'). To complicate matters further, the matrix transpose operators are also apostrophes (A' (Hermitian) or A.' (regular)).

Now, for dark reasons (that I will not elaborate on :), I'm trying to interpret MATLAB code in MATLAB's own language.

Currently I'm trying to remove all trailing comments in a cell-array of strings, each containing a line of MATLAB code. At first glance, this might seem simple:

>> str = 'simpleCommand(); % simple trailing comment';
>> regexprep(str, '%.*$', '')
ans =
    simpleCommand(); 

But of course, something like this might come along:

>> str = ' fprintf(''%d%*c%3.0f\n'', value, args{:}); % Let''s do this! ';
>> regexprep(str, '%.*$', '') 
ans = 
    fprintf('        %//   <-- WRONG!

Obviously, we need to exclude from the match all comment characters that reside inside strings, while also taking into account that a single apostrophe (or a dot-apostrophe) directly following a statement is an operator, not a string delimiter.

Based on the assumption that the number of string delimiters before the comment character must be even (which I know is incomplete, because of the matrix-transpose operator), I conjured up the following dynamic regex to handle this sort of case:

>> str = {
       'myFun( {''test'' ''%''}); % let''s '                 
       'sprintf(str, ''%*8.0f%*s%c%3d\n''); % it''s '        
       'sprintf(str, ''%*8.0f%*s%c%3d\n''); % let''s '       
       'sprintf(str, ''%*8.0f%*s%c%3d\n'');  '
       'A = A.'';%tight trailing comment'
   };
>> 
>> C = regexprep(str, '(^.*)(?@mod(sum(\1==''''''''),2)==0;)(%.*$)', '$1')

However:

C = 
    'myFun( {'test' '%'}); '              %// success
    'sprintf(str, '%*8.0f%*s%c%3d\n'); '  %// success
    'sprintf(str, '%*8.0f%*s%c%3d\n'); '  %// success
    'sprintf(str, '%*8.0f%*s%c'           %// FAIL
    'A = A.';'                            %// success (although I'm not sure why)

So I'm almost there, but not quite yet :)

Unfortunately I've exhausted the amount of time I can spend thinking about this and need to continue with other things, so perhaps someone else who has more time is friendly enough to think about these questions:

  1. Are comment characters inside strings the only exception I need to look out for?
  2. What is the correct and/or more efficient way to do this?

Recommended answer

This handles the conjugate-transpose case by checking which characters are allowed immediately before a transpose apostrophe:

  1. Numbers: 2'
  2. Letters: A'
  3. Dot: A.'
  4. Closing parenthesis, brace and bracket: A(1)', A{1}' and [1 2 3]'

These are the only cases I can think of now.

C = regexprep(str, '^(([^'']*''[^'']*''|[^'']*[\.a-zA-Z0-9\)\}\]]''[^'']*)*[^'']*)%.*$', '$1')
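The expression is easier to read when assembled from smaller pieces. The following is only a sketch of one way to do that; the variable names notQuote, strLit and transp are illustrative and not part of the answer. Before the comment, the pattern repeatedly consumes either a complete quoted string or an apostrophe used as a transpose, then a final run without apostrophes.

```matlab
% Rebuild the same pattern from named parts (names are illustrative only).
notQuote = '[^'']*';                                    % run of characters that are not apostrophes
strLit   = [notQuote '''' notQuote ''''];               % ... 'a complete string literal'
transp   = [notQuote '[\.a-zA-Z0-9\)\}\]]''' notQuote]; % ... apostrophe preceded by . letter digit ) } ]
pattern  = ['^((' strLit '|' transp ')*' notQuote ')%.*$'];

% Keep token 1 (the code) and drop the trailing comment.
C = regexprep('A = A.'';%tight trailing comment', pattern, '$1');
```

Because the pieces are concatenated verbatim, pattern is character-for-character the expression used in the answer.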

On your example it returns:

>> C = regexprep(str, '^(([^'']*''[^'']*''|[^'']*[\.a-zA-Z0-9\)\}\]]''[^'']*)*[^'']*)%.*$', '$1')

C = 

    'myFun( {'test' '%'}); '
    'sprintf(str, '%*8.0f%*s%c%3d\n'); '
    'sprintf(str, '%*8.0f%*s%c%3d\n'); '
    'sprintf(str, '%*8.0f%*s%c%3d\n');  '
    'A = A.';'
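As an alternative to a single regex, the same rules can be written as an explicit left-to-right scan. This is a minimal sketch, not the answer's method: stripTrailingComment is a hypothetical helper name, it assumes only apostrophe-delimited strings, and it ignores double-quoted strings, line continuations and block comments. It uses the transpose rule listed above, extended by the assumption that an underscore or a preceding transpose apostrophe also marks a transpose.

```matlab
function out = stripTrailingComment(line)
% Hypothetical helper: remove a trailing % comment from one line of MATLAB code.
    % Characters after which an apostrophe is taken to be a transpose operator
    % (assumption: letters, digits, underscore, dot, closing brackets, apostrophe).
    beforeTranspose = ['a':'z' 'A':'Z' '0':'9' '_.)}]'''];
    inString = false;
    out = line;
    n = numel(line);
    k = 1;
    while k <= n
        c = line(k);
        if inString
            if c == ''''
                if k < n && line(k+1) == ''''
                    k = k + 1;           % doubled apostrophe: escaped quote, still in string
                else
                    inString = false;    % closing string delimiter
                end
            end
        elseif c == '%'
            out = line(1:k-1);           % first % outside a string starts the comment
            return
        elseif c == ''''
            % Outside a string an apostrophe either transposes or opens a string.
            isTranspose = k > 1 && ismember(line(k-1), beforeTranspose);
            inString = ~isTranspose;
        end
        k = k + 1;
    end
end
```

For the cell array in the question it could be applied as cellfun(@stripTrailingComment, str, 'UniformOutput', false). Because the scan stops at the first % found outside a string, a comment that itself contains a % is removed whole.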
