How can I use a look-behind to match either a single or a double quote?

Question

I have a file with lines like the following, from which I want to extract a series of strings:

hello.this_is("bla bla bla")
some random text
hello.this_is('hello hello')
other stuff

What I need to get (from many files, but that is not important here) is the content between hello.this_is( and ), without the surrounding quotes, so my desired output is:

bla bla bla
hello hello

As you can see, the text within the parentheses can be enclosed in either double or single quotes.

If it were only single quotes, I would use a look-behind and a look-ahead like this:

grep -Po "(?<=hello.this_is\(').*(?=')" file
#                            ^      ^
# returns ---> hello hello

Similarly, to get the strings in double quotes I would say:

grep -Po '(?<=hello.this_is\(").*(?=")' file
#                            ^      ^
# returns ---> bla bla bla

However, I want to match both cases with one command, so that it handles both single and double quotes. I tried using $'' to escape the quotes, but could not make it work:

grep -Po '(?<=hello.this_is\($'["\']').*(?=$'["\']')' file
#                            ^^^^^^^^      ^^^^^^^^

I can of course use the octal ASCII codes (047 for the single quote, 042 for the double quote) and say:

grep -Po '(?<=hello.this_is\([\047\042]).*' file

but I would prefer to write the quote characters themselves, since 047 and 042 are not as recognizable to me as single and double quotes are.

Recommended answer

Note: The sed command at the bottom of this answer works only as long as your strings are well-behaved, like:

"foo"

'bar'

As soon as your strings start to misbehave :), for example by containing escaped quotes like:

"hello \"world\""

it will no longer work.
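As a rough illustration of why escaped quotes break a simple pattern, the sketch below uses a made-up input line and a deliberately simplified sed expression (neither comes from the original answer). A pattern that reads "everything up to the next double quote" stops at the escaped quote inside the string:

```shell
# Hypothetical input: the string argument contains escaped double quotes.
line='hello.this_is("hello \"world\"")'

# Simplified pattern (an assumption for this demo): capture everything up to
# the next double quote. It cannot tell an escaped quote from the closing one.
result=$(printf '%s\n' "$line" | sed -n 's/.*hello\.this_is("\([^"]*\)".*/\1/p')

# Prints: hello \
printf '%s\n' "$result"
```

The captured text ends at the backslash, so the rest of the string is lost.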

Your input looks like source code. For a robust solution, I recommend using a parser for that language to extract the strings.

For the plain use case:

You can use sed. Unlike grep -oP, which only works with GNU grep, this solution should work on any POSIX platform:

sed -n 's/hello\.this_is(\(["'\'']\)\([^"]*\)\(["'\'']\).*/\2/gp' file
#                                    ^^^^^^^^              ^^
#                                          capture group 2 ^
