Regex for comments in strings, strings in comments, etc


This is a question I've solved and wanted to post in Q&A style because I think more people could use the solution. Or maybe improve the solution, or show where it breaks.

The problem

You want to do something with quoted strings and/or comments in a body of text: extract them, highlight them, what have you. But some quoted strings are inside comments, and sometimes comment characters are inside strings. String delimiters can be escaped, and comments can be line comments or block comments. And just when you think you have a solution, somebody complains that it doesn't work when there's a regex literal in their JavaScript. What do you do?

Concrete example

var ret = row.match(/'([^']+)'/i); // Get 1st single quoted string's content
if (!ret) return ''; /* return if there's no matches 
                        Otherwise turn into xml: */
var message = '\t<' + ret[1].replace(/\[1]/g, '').replace(/\/@(\w+)/i, ' $1=""') + '></' + ret[1].match(/[A-Z_]\w*/i)[0] + '>';

alert('xml: \'' + message + '\''); /*
alert("xml: '" + message + "'"); // */

var line = prompt('How do line-comments start? (e.g. //)', '//');

// do something with line

This code is nonsense, but how do I handle each of the cases in the JavaScript above correctly?

The closest thing I found is the question "Comments in string and strings in comments", where Jan Goyvaerts himself answered with a similar approach. But that answer doesn't handle apostrophe-escaping yet.

Solution

I've broken the regex into 4 lines corresponding to the 4 paths in the graph: quoted strings, regex literals, line comments, and block comments, in that order. Remove those line breaks if you ever use this.

(['"])(?:(?!\1|\\).|\\.)*\1|
\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*|
\/\/[^\n]*(?:\n|$)|
\/\*(?:[^*]|\*(?!\/))*\*\/

Debuggex Demo

This regex grabs 4 types of "blocks", each of which can contain the delimiters of the other 3. You can iterate over the matches and do whatever you want with each one, or discard it if it isn't a kind you want to do anything with.
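Iterating over the matches could look roughly like the sketch below. Only the pattern itself comes from this answer; the names `BLOCK_RE`, `classify` and `extractBlocks` are made up for illustration, and telling the block types apart by their first characters is just one assumed way to do it.

```javascript
// The four alternatives from the answer, joined into a single pattern.
// String.raw keeps the backslashes exactly as written above.
const BLOCK_RE = new RegExp(
  [
    String.raw`(['"])(?:(?!\1|\\).|\\.)*\1`,        // quoted string
    String.raw`\/(?![*/])(?:[^\\/]|\\.)+\/[igm]*`,  // regex literal
    String.raw`\/\/[^\n]*(?:\n|$)`,                 // line comment
    String.raw`\/\*(?:[^*]|\*(?!\/))*\*\/`,         // block comment
  ].join('|'),
  'g'
);

// Hypothetical helper: decide which alternative matched from how the text starts.
function classify(text) {
  if (text[0] === "'" || text[0] === '"') return 'string';
  if (text.startsWith('//')) return 'line-comment';
  if (text.startsWith('/*')) return 'block-comment';
  return 'regex';
}

// Hypothetical helper: collect every block with its type and position.
function extractBlocks(source) {
  const blocks = [];
  for (const m of source.matchAll(BLOCK_RE)) {
    blocks.push({ type: classify(m[0]), text: m[0], index: m.index });
  }
  return blocks;
}

const sample = [
  "var s = 'a // b'; // c",
  "var r = /x\\/y/g; /* 'z' */",
].join('\n');

console.log(extractBlocks(sample).map((b) => b.type));
// → [ 'string', 'line-comment', 'regex', 'block-comment' ]
```

From here you can filter on `type` to keep only the blocks you care about. One caveat worth checking against your own input: a purely lexical pattern like this can't tell a regex literal from division, so `a / b / 2` comes back with `/ b /` as a 'regex' block.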

This one is specific to JavaScript, as it's a language I'm familiar with. But you could easily adapt it to the language of your preference.

Anyone see a way in which this code breaks?

Edit: I have since been notified that the general pattern is described very well here: http://stackoverflow.com/a/23589204/2684660. Neato!
