Finding words surrounded by quotation marks in Perl

Question

I am reading another Perl file line by line and need to find any word or set of words surrounded by single or double quotation marks. This is an example of the code I am reading in:

#!/usr/bin/env perl
use strict;
use warnings;

my $string = 'Hello World!';
print "$string\n"; 

Basically, I need to find and print out 'Hello World!' and "$string\n".

I've read my file in without problems and stored its contents in an array. From there I loop over each line and try to find the quoted text with a regex, like so:

for(@contents) {
   if(/\"|\'[^\"|\']*\"|\'/) {
       print $_."\n";
   }
}

This gives me the following output:

my $string = 'Hello World!';
print "$string\n"; 

I tried splitting the contents on whitespace and then looking for a match, but that gives me this:

'Hello
World!'
"$string\n";

I've tried numerous solutions others have suggested on here, but to no avail. I have also tried Text::ParseWords and its parse_line function, but that gives me completely wrong output.

Any ideas that could help me?

Recommended answer

You just need to add capturing parentheses to your regex and print the capture instead of the whole line:

use strict;
use warnings;

while (<DATA>) {
    if(/(["'][^"']*["'])/) {
        print "$1\n";
    }
}

__DATA__
#!/usr/bin/env perl
use strict;
use warnings;

my $string = 'Hello World!';
print "$string\n"; 
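As a side note on why the pattern in the question behaves the way it does: alternation (|) binds more loosely than anything else in a regex, so the original pattern is read as three separate alternatives rather than "quote, contents, quote". The sketch below illustrates that reading; first_match is a hypothetical helper introduced only for this example.

```perl
use strict;
use warnings;

# Alternation binds loosest, so the question's pattern is really three
# alternatives:   \"    or    \'[^\"|\']*\"    or    \'
my $original = qr/\"|\'[^\"|\']*\"|\'/;

# Putting the quote characters in a character class gives the intended shape.
my $grouped = qr/["'][^"']*["']/;

# Hypothetical helper: return the text matched by $re in $line, or undef.
sub first_match {
    my ($re, $line) = @_;
    return $line =~ /($re)/ ? $1 : undef;
}

my $line = q{my $string = 'Hello World!';};
print first_match($original, $line), "\n";   # prints a lone '
print first_match($grouped,  $line), "\n";   # prints 'Hello World!'
```

Because a lone quote character is enough to satisfy the original pattern, every line containing a quote matches, and print $_ then outputs the whole line.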

Note that your regex still has plenty of flaws. For example, '\'' won't match properly, and neither will "He said 'boo'". To get closer you'll have to do some balanced-delimiter checking, but there isn't going to be any perfect solution.
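To make those two failure cases concrete, here is a small sketch that runs the simple pattern against them. first_quoted is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# The simple pattern: a quote, anything that is not a quote, a quote.
my $simple = qr/["'][^"']*["']/;

# Hypothetical helper: return the first quoted chunk found in $line, or undef.
sub first_quoted {
    my ($line) = @_;
    return $line =~ /($simple)/ ? $1 : undef;
}

# An escaped quote ends the match early: only '\' is captured.
print first_quoted(q{my $q = '\'';}), "\n";

# A nested quote of the other kind also ends it early: "He said ' is captured.
print first_quoted(q{print "He said 'boo'";}), "\n";
```

In both cases the pattern stops at the first quote character it sees, whether or not that character actually closes the string.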

For a solution that gets a little closer, you could use the following condition instead:

if(/('(?:(?>[^'\\]+)|\\.)*'|"(?:(?>[^"\\]+)|\\.)*")/) {

That takes care of the exceptions above, as well as strings like print "how about ' this \" and ' more \n";, but there are still edge cases such as qq{} or q{}, not to mention strings that span more than one line.
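Putting the pieces together, a minimal self-contained sketch might look like the following. It assumes the input is still processed one line at a time, and it adds the /g modifier (not in the snippet above) so that several quoted strings on one line are all reported. The helper quoted_strings and the sample @lines are hypothetical.

```perl
use strict;
use warnings;

# A single- or double-quoted string, allowing backslash escapes inside it.
my $quoted = qr/'(?:(?>[^'\\]+)|\\.)*'|"(?:(?>[^"\\]+)|\\.)*"/;

# Hypothetical helper: return every quoted string found in $line.
sub quoted_strings {
    my ($line) = @_;
    my @found = $line =~ /($quoted)/g;   # list context: one item per match
    return @found;
}

# Sample input lines (q{} keeps the backslashes literal).
my @lines = (
    q{my $string = 'Hello World!';},
    q{print "$string\n";},
    q{my $q = '\'';},
    q{print "He said 'boo'";},
    q{print "how about ' this \" and ' more \n";},
);

for my $line (@lines) {
    print "$_\n" for quoted_strings($line);
}
```

Each sample line yields its full quoted string, including the escaped and nested quotes. The limitations mentioned above still apply: q{}, qq{}, and strings spanning several lines are not handled by this sketch.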

In other words, if your goal is perfection, this project may be beyond most people's skills, but hopefully the above will be of some help.
