Tcl regexp does not escape asterisk (*)

Problem description

In my script I get a string that looks like this:

Reading thisfile.txt
"lib" maps to directory somedir/work.
"superlib" maps to directory somedir/work.
"anotherlib" maps to directory somedir/anotherlib.
** Error: (errorcode) Cannot access file "somedir/anotherlib". <--
No such file or directory. (errno = ENOENT)                    <--  
Reading anotherfile.txt
.....

But the two marked lines with the error code appear only some of the time. I'm trying to use a regular expression to capture the lines after Reading thisfile.txt up to the line before either Reading anotherfile.txt or, if it is present, the line starting with **.

So in every case result should look like this:

"lib" maps to directory somedir/work.
"superlib" maps to directory somedir/work.
"anotherlib" maps to directory somedir/anotherlib.

I have tried it with this regexp:

set pattern ".*Reading thisfile.txt\n(.*)\n.*Reading .*$"

Then I do:

regexp -all $pattern $data -> result

But that only works if there is no error message, so I tried to look for the *:

set pattern ".*Reading thisfile.txt\n(.*)\n.*\[\*|Reading\].*$"

But that does not work either. The part with ** Error is still there.

I wonder why. This one doesn't even compile:

set pattern ".*Reading thisfile.txt\n(.*)\n.*\*?.*Reading .*$"

Any idea how I can find the * and keep it out of the match?

Recommended answer

Given the way you wrote your regex, you have to put it in braces:

set pattern {.*Reading thisfile\.txt\n(.*)\n.*\*?.*Reading .*$}

If you use quotes, you have to write:

set pattern ".*Reading thisfile\\.txt\n(.*)\n.*\\*?.*Reading .*$"

i.e. add a second backslash to escape each backslash that has to reach the regex engine (here \. and \*).
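
To see why the quoted patterns from the question misbehave, it can help to print what the regex engine actually receives. The following is a minimal sketch that assumes standard Tcl backslash substitution inside double quotes; the variable names (quoted, braced, doubled, attempt, quotedFails) are illustrative and not part of the original question.

```tcl
# Inside double quotes Tcl performs backslash substitution before regexp
# ever sees the string; inside braces it does not.
set quoted ".*\*?"      ;# the backslash is dropped, leaving a bare *
set braced {.*\*?}      ;# the backslash survives untouched
set doubled ".*\\*?"    ;# two backslashes collapse to a single one

puts "quoted:  $quoted"
puts "braced:  $braced"
puts "doubled: $doubled"

# The quoted form leaves one quantifier directly after another, which the
# regex compiler should reject; catch returns 1 when the command errors.
set quotedFails [catch {regexp -- $quoted "a*b"} msg]
puts "quoted pattern fails to compile: $quotedFails"

# The second attempt from the question reaches the engine as a bracket
# expression, which matches one single character from the listed set.
set attempt "\[\*|Reading\]"
puts "attempt: $attempt"
```

Seen this way, the second attempt never looked for the word Reading or for a line starting with an asterisk: the bracket expression is satisfied by any single one of the listed characters, so the greedy (.*) before it is free to swallow the error lines.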

The above will capture something, albeit everything between the first and the last Reading.

If you want to match from Reading thisfile.txt up to the next line that begins with either Reading or an asterisk, you could use:

set pattern {^Reading thisfile\.txt\n(.*?)\n(?=^Reading|^\*)}
regexp -all -lineanchor -- $pattern $data -> result

(?=^Reading|^\*) is a positive lookahead, and I changed your (.*) to the non-greedy (.*?) so that you get each occurrence separately rather than everything from the first to the last Reading.

The positive lookahead succeeds if either Reading or * comes next, and only when it sits at the start of a line.

-lineanchor makes ^ match at every beginning of line instead of at the start of the string.
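
As a rough check of the pattern above, the sketch below applies it to the sample text from the question, with the "<--" markers left out. The variable names data and result follow the question; nothing else is assumed beyond the regexp switches already shown.

```tcl
# Sample input from the question, including the two optional error lines.
set data {Reading thisfile.txt
"lib" maps to directory somedir/work.
"superlib" maps to directory somedir/work.
"anotherlib" maps to directory somedir/anotherlib.
** Error: (errorcode) Cannot access file "somedir/anotherlib".
No such file or directory. (errno = ENOENT)
Reading anotherfile.txt}

# Braces keep every backslash intact for the regex engine.
set pattern {^Reading thisfile\.txt\n(.*?)\n(?=^Reading|^\*)}

# -lineanchor lets ^ match at the start of each line; . still matches
# newlines, so the non-greedy group can span several lines.
if {[regexp -lineanchor -- $pattern $data -> result]} {
    puts $result
} else {
    puts "no match"
}
```

Because the lookahead only inspects what follows and consumes nothing, neither the ** Error line nor the next Reading line becomes part of the match.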

I forgot to mention that if you have more than one match, you have to store the return value of regexp and use the -inline switch instead of the construct above (otherwise you get only the last submatch):

set results [regexp -all -inline -lineanchor -- $pattern $data]
foreach {main sub} $results {
  puts $sub
}
