正则表达式在 PHP 中重复捕获组 [英] Regex repeated capturing group in PHP

查看:31
本文介绍了正则表达式在 PHP 中重复捕获组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从一个带有路由的文件中获取信息,因此对于这项工作,我选择了正则表达式,但是我对重复的信息有问题,为了更好地提出问题,我将把我所拥有的和我想要的有:

i'm trying to get information from one file with routes, so for this work i chose regex, but i have the problem with the repeted information, for do a better question i will put what i have, and what i want to have:

所以我有一个文件:

Codes: C - Connected, S - Static, R - RIP, B - BGP,
       O - OSPF IntraArea (IA - InterArea, E - External, N - NSSA)
       A - Aggregate, K - Kernel Remnant, H - Hidden, P - Suppressed,
       U - Unreachable, i - Inactive

O E       0.0.0.0/0           via 10.140, bond1.30, cost 1:10, age 5  
                              via 10.141, bond1.31  
                              via 10.142, bond1.32  
O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  
O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  
O IA      10.138/29       via 10.140, bond1.30, cost 46, age 1029440  
C         10.141/29    is directly connected, bond2.35
C         10.141/29    is directly connected, bond2.35   

我做了这个正则表达式:

And i made this regex:

(S|R|B|O|A|K|H|P|U|i) +(IA|E|N|) +([0-9.]+)\/([0-9]+) +via +([0-9.]+), +([a-zA-Z0-9.]+|), +cost +([0-9]+:|)([0-9]+), +age +[0-9]+ +\n(( +via +([0-9.]+), +([a-zA-Z0-9.]+|) +\n)+|)

我的问题是结尾部分 (( +via +([0-9.]+), +([a-zA-Z0-9.]+|) +\n)+|) 因为这个正则表达式让我得到这个

My problem is with the end part (( +via +([0-9.]+), +([a-zA-Z0-9.]+|) +\n)+|) because this regex get me this

array[0]=>' via 10.141, bond1.31  
           via 10.142, bond1.32';
array[1]=>'10.142';

array[2]=>'bond1.32';

但我想得到

array[0]=>'10.141';

array[1]=>'bond1.31';

array[3]=>'10.142';

array[4]=>'bond1.32';

我在关于正则表达式的页面中测试了正则表达式,其中一个告诉我:

I test the regex in pages about regex and one of them tell me this:

注意:重复的捕获组只会捕获最后一次迭代.在重复组周围放置一个捕获组以捕获所有迭代或使用非捕获组代替,如果你不是对数据感兴趣

Note:A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data

但我真的不知道这是什么意思以及如何解决它.

But i really dont know what is the mean of this and how to fix it.

注意:这是为了获取文件是关于cisco中的路由,带有show ip route

Note: this is for get the file is about routes in cisco with show ip route

更新 1

我将正则表达式更改为

(S|R|B|O|A|K|H|P|U|i) +(IA|E|N|) +([0-9.]+)\/([0-9]+) +via +([0-9.]+), +([a-zA-Z0-9.]+|), +cost +([0-9]+:|)([0-9]+), +age +[0-9]+ +\n(?: +via +([0-9.]+), +([a-zA-Z0-9.]+|) +\n)*

这样我就没有

array[0]=>' via 10.141, bond1.31  
       via 10.142, bond1.32';

但是我没有重复的部分

推荐答案

好的,我把你的正则表达式改成这样:

Ok I've changed your regexp like this:

$txt = "Codes: C - Connected, S - Static, R - RIP, B - BGP,
       O - OSPF IntraArea (IA - InterArea, E - External, N - NSSA)
       A - Aggregate, K - Kernel Remnant, H - Hidden, P - Suppressed,
       U - Unreachable, i - Inactive

O E       0.0.0.0/0           via 10.140, bond1.30, cost 1:10, age 5  
                              via 10.141, bond1.31  
                              via 10.142, bond1.32  
O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  
O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  
O IA      10.138/29       via 10.140, bond1.30, cost 46, age 1029440  
C         10.141/29    is directly connected, bond2.35
C         10.141/29    is directly connected, bond2.35   
";

$regexp = '#(.) +([A-Z]{1,2}) +([\d.]+/\d+) +via ([\d.]+), ([a-zA-Z0-9.]+), cost [\d:]+, age \d+ +(?:\n +via ([\d.]+), ([a-zA-Z0-9.]+))*#m';
$matches = [];
preg_match_all($regexp, $txt, $matches, PREG_SET_ORDER);
var_dump($matches);

这是输出:

array(4) {
  [0] =>
  array(8) {
    [0] =>
    string(125) "O E       0.0.0.0/0           via 10.140, bond1.30, cost 1:10, age 5  
                                  via 10.141, bond1.31"
    [1] =>
    string(1) "O"
    [2] =>
    string(1) "E"
    [3] =>
    string(9) "0.0.0.0/0"
    [4] =>
    string(6) "10.140"
    [5] =>
    string(8) "bond1.30"
    [6] =>
    string(6) "10.141"
    [7] =>
    string(8) "bond1.31"
  }
  [1] =>
  array(6) {
    [0] =>
    string(69) "O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  "
    [1] =>
    string(1) "O"
    [2] =>
    string(1) "E"
    [3] =>
    string(9) "10.112/23"
    [4] =>
    string(6) "10.140"
    [5] =>
    string(8) "bond1.30"
  }
  [2] =>
  array(6) {
    [0] =>
    string(69) "O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  "
    [1] =>
    string(1) "O"
    [2] =>
    string(1) "E"
    [3] =>
    string(9) "10.112/23"
    [4] =>
    string(6) "10.140"
    [5] =>
    string(8) "bond1.30"
  }
  [3] =>
  array(6) {
    [0] =>
    string(70) "O IA      10.138/29       via 10.140, bond1.30, cost 46, age 1029440  "
    [1] =>
    string(1) "O"
    [2] =>
    string(2) "IA"
    [3] =>
    string(9) "10.138/29"
    [4] =>
    string(6) "10.140"
    [5] =>
    string(8) "bond1.30"
  }
}

它不起作用,因为缺少第三个过孔

新版本,逐行:

$txt = "Codes: C - Connected, S - Static, R - RIP, B - BGP,
       O - OSPF IntraArea (IA - InterArea, E - External, N - NSSA)
       A - Aggregate, K - Kernel Remnant, H - Hidden, P - Suppressed,
       U - Unreachable, i - Inactive

O E       0.0.0.0/0           via 10.140, bond1.30, cost 1:10, age 5  
                              via 10.141, bond1.31  
                              via 10.142, bond1.32  
O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  
O E       10.112/23       via 10.140, bond1.30, cost 46:1, age 2511  
O IA      10.138/29       via 10.140, bond1.30, cost 46, age 1029440  
C         10.141/29    is directly connected, bond2.35
C         10.141/29    is directly connected, bond2.35   
";

$grouped = [];
$i = 0;
foreach (explode("\n", $txt) as $line) {
    $matches = [];
    if (preg_match('#^(.) +([A-Z]{1,2}) +([\d.]+/\d+) +via ([\d.]+), ([a-zA-Z0-9.]+)#', $line, $matches)) {
        array_shift($matches);
        $grouped[++$i] = $matches;
    } else if(preg_match('#^ +via ([\d.]+), ([a-zA-Z0-9.]+)#', $line, $matches)){
        array_push($grouped[$i], $matches[1], $matches[2]);
    }
}
var_dump($grouped);

现在它正在工作:

array(4) {
  [1] =>
  array(9) {
    [0] =>
    string(1) "O"
    [1] =>
    string(1) "E"
    [2] =>
    string(9) "0.0.0.0/0"
    [3] =>
    string(6) "10.140"
    [4] =>
    string(8) "bond1.30"
    [5] =>
    string(6) "10.141"
    [6] =>
    string(8) "bond1.31"
    [7] =>
    string(6) "10.142"
    [8] =>
    string(8) "bond1.32"
  }
  [2] =>
  array(5) {
    [0] =>
    string(1) "O"
    [1] =>
    string(1) "E"
    [2] =>
    string(9) "10.112/23"
    [3] =>
    string(6) "10.140"
    [4] =>
    string(8) "bond1.30"
  }
  [3] =>
  array(5) {
    [0] =>
    string(1) "O"
    [1] =>
    string(1) "E"
    [2] =>
    string(9) "10.112/23"
    [3] =>
    string(6) "10.140"
    [4] =>
    string(8) "bond1.30"
  }
  [4] =>
  array(5) {
    [0] =>
    string(1) "O"
    [1] =>
    string(2) "IA"
    [2] =>
    string(9) "10.138/29"
    [3] =>
    string(6) "10.140"
    [4] =>
    string(8) "bond1.30"
  }
}

这篇关于正则表达式在 PHP 中重复捕获组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆