Perl replace nested blocks regular expression


Problem description

I need to get the nested blocks into a hash array or hash tree so that I can substitute the blocks with dynamic content. I need to replace the code between

<!--block:XXX-->

and the first closing

<!--endblock-->

with my dynamic content.

I have this code that finds one-level comment blocks, but not nested ones:

#<!--block:listing-->... html code block here ...<!--endblock-->
$blocks{$1} = $2 while $content =~ /<!--block:(.*?)-->((?:(?:(?!<!--(.*?)-->).)|(?R))*?)<!--endblock-->/igs;

Here is the complete nested HTML template that I want to process. I need to find the inner block "block:third" and replace it with my content, then find "block:second" and replace it, then find the outer block "block:first" and replace it. Note that there can be any number of nested blocks, not just three as in the example below.

use Data::Dumper;

$content=<<HTML;
some html content here

<!--block:first-->
    some html content here

    <!--block:second-->
        some html content here

        <!--block:third-->
            some html content here
        <!--endblock-->

        some html content here
    <!--endblock-->

    some html content here
<!--endblock-->
HTML

$blocks{$1} = $2 while $content =~ /<!--block:(.*?)-->((?:(?:(?!<!--(.*?)-->).)|(?R))*?)<!--endblock-->/igs;
print Dumper(\%blocks);   # pass a reference so the hash is dumped as one structure

That way I can access and modify the blocks, e.g. $blocks{first} = "my content here" and $blocks{second} = "another content here", and then replace the blocks.

I created this regex
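The inner-first replacement described in the question can also be sketched without a recursive pattern. The following is not the asker's regex but an assumed alternative: it repeatedly matches only innermost blocks (blocks whose body contains no other delimiter), stores each body in a hash, and leaves a placeholder behind. The `[[name]]` placeholder syntax, the one-line template, and the variable names are illustrative assumptions. It also assumes block names are unique and that the template text never contains `[[name]]` itself.

```perl
use strict;
use warnings;

# Shortened, hypothetical template with three nested blocks.
my $content = 'A<!--block:first-->B<!--block:second-->C'
            . '<!--block:third-->D<!--endblock-->E<!--endblock-->F<!--endblock-->G';

# An innermost block: its body may not contain any block delimiter.
my $innermost = qr{
    <!--block:(\w+)-->                                # capture 1: block name
    ( (?: (?! <!--block: | <!--endblock--> ) . )* )   # capture 2: delimiter-free body
    <!--endblock-->
}xs;

# Peel blocks from the inside out. Each pass turns the innermost blocks
# into placeholders, which exposes the next level for the following pass.
my %blocks;
1 while $content =~ s{$innermost}{ $blocks{$1} = $2; '[[' . $1 . ']]' }ge;

# Modify any block, then expand the placeholders again from the outside in.
$blocks{third} = 'my content here';
1 while $content =~ s{\[\[(\w+)\]\]}{ $blocks{$1} // '' }ge;

print "$content\n";
```

After the first loop, `%blocks` holds one entry per block, with each child block represented by its placeholder, so any level can be edited before the template is reassembled.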

Recommended answer

I'm going to add an additional answer. It's in line with my previous answer, but slightly more complete, and I don't want to muddy up that answer any more.

This is for @daliaessam, and it is also a specific response to @Miller's anecdote on recursive parsing using regular expressions.

There are only 3 parts to consider. So, building on my previous answer, I lay out for you a template on how to do this. It's not as hard as you think.

Cheers!

 # //////////////////////////////////////////////////////
 # // The General Guide to 3-Part Recursive Parsing
 # // ----------------------------------------------
 # // Part 1. CONTENT
 # // Part 2. CORE
 # // Part 3. ERRORS

 (?is)

 (?:
      (                                  # (1), Take off CONTENT
           (?&content) 
      )
   |                                   # OR
      (?>                                # Start-Delimiter (in this case, must be atomic because of .*?)
           <!--block:
           ( .*? )                            # (2), Block name
           -->
      )
      (                                  # (3), Take off The CORE
           (?&core) 
        |  
      )
      <!--endblock-->                    # End-Delimiter

   |                                   # OR
      (                                  # (4), Take off Unbalanced (delimiter) ERRORS
           <!--
           (?: block: .*? | endblock )
           -->
      )
 )

 # ///////////////////////
 # // Subroutines
 # // ---------------

 (?(DEFINE)

      # core
      (?<core>
           (?>
                (?&content) 
             |  
                (?> <!--block: .*? --> )
                # recurse core
                (?:
                     (?&core) 
                  |  
                )
                <!--endblock-->
           )+
      )

      # content 
      (?<content>
           (?>
                (?!
                     <!--
                     (?: block: .*? | endblock )
                     -->
                )
                . 
           )+
      )

 )
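The `(?(DEFINE) ...)` group and the `(?&name)` calls are what make the pattern above recursive: DEFINE declares named subpatterns without matching anything itself, and `(?&name)` invokes one like a subroutine, including from inside its own definition. A minimal sketch of the same mechanism on a simpler, hypothetical problem (balanced parentheses instead of block comments), assuming Perl 5.10 or later:

```perl
use strict;
use warnings;

# Hypothetical example: the whole string must be one balanced group.
my $balanced = qr{
    \A (?&group) \z

    (?(DEFINE)
        (?<group>
            \(                   # opening delimiter
            (?: [^()]++          # plain content, matched possessively
              | (?&group)        # or a nested group, by recursion
            )*
            \)                   # closing delimiter
        )
    )
}x;

for my $s ('(a(b)c)', '(a(b c)', '()') {
    printf "%-10s %s\n", $s, ($s =~ $balanced ? 'balanced' : 'unbalanced');
}
```

The first and third strings are reported as balanced and the second as unbalanced. The `core` subpattern in the answer follows the same shape, with `<!--block:...-->` and `<!--endblock-->` in place of the parentheses and `content` in place of `[^()]++`.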

Perl code:

use strict;
use warnings;

use Data::Dumper;

$/ = undef;
my $content = <DATA>;

# Set the error mode on/off here ..
my $BailOnError = 1;
my $IsError = 0;

my $href = {};

ParseCore( $href, $content );

#print Dumper($href);

print "\n\n";
print "\nBase======================\n";
print $href->{content};
print "\nFirst======================\n";
print $href->{first}->{content};
print "\nSecond======================\n";
print $href->{first}->{second}->{content};
print "\nThird======================\n";
print $href->{first}->{second}->{third}->{content};
print "\nFourth======================\n";
print $href->{first}->{second}->{third}->{fourth}->{content};
print "\nFifth======================\n";
print $href->{first}->{second}->{third}->{fourth}->{fifth}->{content};
print "\nSix======================\n";
print $href->{six}->{content};
print "\nSeven======================\n";
print $href->{six}->{seven}->{content};
print "\nEight======================\n";
print $href->{six}->{seven}->{eight}->{content};

exit;


sub ParseCore
{
    my ($aref, $core) = @_;
    my ($k, $v);
    while ( $core =~ /(?is)(?:((?&content))|(?><!--block:(.*?)-->)((?&core)|)<!--endblock-->|(<!--(?:block:.*?|endblock)-->))(?(DEFINE)(?<core>(?>(?&content)|(?><!--block:.*?-->)(?:(?&core)|)<!--endblock-->)+)(?<content>(?>(?!<!--(?:block:.*?|endblock)-->).)+))/g )
    {
       if (defined $1)
       {
         # CONTENT
           $aref->{content} .= $1;
       }
       elsif (defined $2)
       {
         # CORE
           $k = $2; $v = $3;
           $aref->{$k} = {};
 #         $aref->{$k}->{content} = $v;
 #         $aref->{$k}->{match} = $&;

           my $curraref = $aref->{$k};
           my $ret = ParseCore($aref->{$k}, $v);
           if ( $BailOnError && $IsError ) {
               last;
           }
           if (defined $ret) {
               $curraref->{'#next'} = $ret;
           }
       }
       else
       {
         # ERRORS
           print "Unbalanced '$4' at position = ", $-[0], "\n";
           $IsError = 1;

           # Decide to continue here ..
           # If BailOnError is set, just unwind recursion. 
           # -------------------------------------------------
           if ( $BailOnError ) {
              last;
           }
       }
    }
    return $k;
}

#================================================
__DATA__
some html content here top base
<!--block:first-->
    <table border="1" style="color:red;">
    <tr class="lines">
        <td align="left" valign="<--valign-->">
    <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
    <!--hello--> <--again--><!--world-->
    some html content here 1 top
    <!--block:second-->
        some html content here 2 top
        <!--block:third-->
            some html content here 3 top
            <!--block:fourth-->
                some html content here 4 top
                <!--block:fifth-->
                    some html content here 5a
                    some html content here 5b
                <!--endblock-->
            <!--endblock-->
            some html content here 3a
            some html content here 3b
        <!--endblock-->
        some html content here 2 bottom
    <!--endblock-->
    some html content here 1 bottom
<!--endblock-->
some html content here1-5 bottom base

some html content here 6-8 top base
<!--block:six-->
    some html content here 6 top
    <!--block:seven-->
        some html content here 7 top
        <!--block:eight-->
            some html content here 8a
            some html content here 8b
        <!--endblock-->
        some html content here 7 bottom
    <!--endblock-->
    some html content here 6 bottom
<!--endblock-->
some html content here 6-8 bottom base

Output >>

Base======================
some html content here top base

some html content here1-5 bottom base

some html content here 6-8 top base

some html content here 6-8 bottom base

First======================

    <table border="1" style="color:red;">
    <tr class="lines">
        <td align="left" valign="<--valign-->">
    <b>bold</b><a href="http://www.mewsoft.com">mewsoft</a>
    <!--hello--> <--again--><!--world-->
    some html content here 1 top

    some html content here 1 bottom

Second======================

        some html content here 2 top

        some html content here 2 bottom

Third======================

            some html content here 3 top

            some html content here 3a
            some html content here 3b

Fourth======================

                some html content here 4 top


Fifth======================

                    some html content here 5a
                    some html content here 5b

Six======================

    some html content here 6 top

    some html content here 6 bottom

Seven======================

        some html content here 7 top

        some html content here 7 bottom

Eight======================

            some html content here 8a
            some html content here 8b
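Because the block names are not known in advance, the hard-coded print statements in the script only work for this sample. A recursive walk over the hash tree is one way to visit every block generically. The sketch below uses a small hand-built tree in the shape ParseCore appears to produce (an assumption: `content` holds a node's own text, `#next` is bookkeeping, and every other key is a child block). The function name `walk_blocks` and the `/`-separated paths are hypothetical.

```perl
use strict;
use warnings;

# Hand-built stand-in for the structure filled in by ParseCore (assumed shape).
my $tree = {
    content => "base\n",
    first   => {
        content => "one\n",
        '#next' => 'second',
        second  => { content => "two\n" },
    },
    six => { content => "six\n" },
};

# Collect "path => content" pairs by recursive descent through child blocks.
sub walk_blocks {
    my ($node, $path, $out) = @_;
    for my $name (sort keys %$node) {
        next if $name eq 'content' || $name eq '#next';   # not child blocks
        my $child_path = $path eq '' ? $name : "$path/$name";
        $out->{$child_path} = $node->{$name}{content} // '';
        walk_blocks($node->{$name}, $child_path, $out);
    }
    return $out;
}

my $flat = walk_blocks($tree, '', {});
print "$_ => $flat->{$_}" for sort keys %$flat;
```

Note that a block literally named `content` would collide with the key that stores a node's text, so a real template would need either a reserved-name check or a separate key for child blocks.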
