Skipping particular positions in a string using the substitution operator in Perl

Question

Yesterday I got stuck on a Perl script. To simplify: suppose there is a string (say ABCDEABCDEABCDEPABCDEABCDEPABCDEABCD). First, I have to break it at every position where "E" occurs; second, I have to break it only where the user wants. The condition is that the program should not cut at sites where E is followed by P. For example, there are 6 Es in this sequence, so one would expect 7 fragments, but since 2 of the Es are followed by P, the output contains only 5 fragments.

I need help with the second case. Suppose the user doesn't want to cut this sequence at, say, the 5th and 10th positions of E in the sequence. What should the script look like so that the program skips only these two sites? My script for the first case is:

my $otext = 'ABCDEABCDEABCDEPABCDEABCDEPABCDEABCD';
$otext=~ s/([E])/$1=/g; #Main cut rule.
$otext=~ s/=P/P/g;
@output = split( /\=/, $otext);
print "@output";

Please help!

Recommended answer

To split on "E" except where it's followed by "P", you should use a negative look-ahead assertion.

From the "Look-Around Assertions" section of perldoc perlre:

  • (?!pattern)
    A zero-width negative look-ahead assertion.
    For example /foo(?!bar)/ matches any occurrence of "foo" that isn't followed by "bar".

my $otext = 'ABCDEABCDEABCDEPABCDEABCDEPABCDEABCD'; 
#                E    E    EP    E    EP    E
my @output=split(/E(?!P)/, $otext); 
use Data::Dumper; print Data::Dumper->Dump([\@output]);

$VAR1 = [
          'ABCD',
          'ABCD',
          'ABCDEPABCD',
          'ABCDEPABCD',
          'ABCD'
        ];
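
Note that split consumes the delimiter, so the "E" at each cut site is missing from the fragments above, whereas the script in the question keeps it. If the "E" should stay attached to the fragment it ends, one possible variation (a sketch, not part of the original answer) is to split on the zero-width position right after an "E" by adding a look-behind; the variable name @kept is only illustrative:

```perl
use strict;
use warnings;

my $otext = 'ABCDEABCDEABCDEPABCDEABCDEPABCDEABCD';

# Split at the empty position that is preceded by "E" and not followed by "P".
# Nothing is consumed, so each fragment keeps its trailing "E".
my @kept = split /(?<=E)(?!P)/, $otext;

print join('=', @kept), "\n";
# prints ABCDE=ABCDE=ABCDEPABCDE=ABCDEPABCDE=ABCD
```

The stitching code further down works the same way on either fragment list.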


Now, in order NOT to cut at occurrences #2 and #4, you can do two things:

  1. Concoct a really fancy regex that automatically fails to match on a given occurrence. I will leave that to someone else to attempt in an answer, for completeness' sake.

  2. Simply stitch together the correct fragments.

I'm too brain-dead to come up with a good idiomatic way of doing it, but the simple and dirty way is either:

  my %no_cuts = map { ($_=>1) } (2,4); # Do not cut in positions 2,4
  my @output_final;
  for(my $i=0; $i < @output; $i++) {
      if ($no_cuts{$i}) {
          $output_final[-1] .= $output[$i];
      } else {
          push @output_final, $output[$i];
      } 
  }
  print Data::Dumper->Dump([\@output_final]);

  $VAR1 = [
            'ABCD',
            'ABCDABCDEPABCD',
            'ABCDEPABCDABCD'
          ];

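Or, simpler:

  my %no_cuts = map { ($_=>1) } (2,4); # Do not cut in positions 2,4
  for(my $i=0; $i < @output; $i++) {
      next unless $no_cuts{$i};     # Only merge at the positions listed above
      # Append to the previous fragment (assumes no two listed positions are adjacent)
      $output[$i-1] .= $output[$i];
      $output[$i]=undef; # Make the slot empty
  }
  my @output_final = grep { defined } @output; # Skip empty slots
  print Data::Dumper->Dump([\@output_final]);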

  $VAR1 = [
            'ABCD',
            'ABCDABCDEPABCD',
            'ABCDEPABCDABCD'
          ];
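
As an alternative to stitching, the cut sites can be counted during a single pass over the string, leaving out the listed ones as they come up. The following is only a sketch under assumptions: the helper name split_skipping and its argument layout are hypothetical, cut sites are numbered from 1 in order of appearance (counting only an "E" not followed by "P"), and each fragment keeps its trailing "E" as in the question's script.

```perl
use strict;
use warnings;

# Hypothetical helper: cut $text after every "E" that is not followed by "P",
# except at the cut sites whose 1-based ordinal appears in @$skip.
sub split_skipping {
    my ($text, $skip) = @_;
    my %no_cut = map { $_ => 1 } @$skip;
    my @fragments;
    my ($start, $site) = (0, 0);

    while ($text =~ /E(?!P)/g) {
        $site++;                      # ordinal of this candidate cut site
        next if $no_cut{$site};       # the user asked not to cut here
        my $end = pos($text);         # offset just past the matched "E"
        push @fragments, substr($text, $start, $end - $start);
        $start = $end;
    }
    # Whatever follows the last cut is the final fragment.
    push @fragments, substr($text, $start) if $start < length $text;
    return @fragments;
}

my @fragments = split_skipping('ABCDEABCDEABCDEPABCDEABCDEPABCDEABCD', [2, 4]);
print join('=', @fragments), "\n";
# prints ABCDE=ABCDEABCDEPABCDE=ABCDEPABCDEABCD
```

Because the match runs in scalar context with /g, pos() gives the offset right after each matched "E", so no second pass over the fragments is needed.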
