Parse text between 2 words


Question

I'm sure this has already been asked by someone else, but I searched here on SO and found nothing: https://stackoverflow.com/search?q=php+parse+between+words

I have a string and want to get an array of all the text contained between 2 delimiters (2 words). I am not confident with regex, so I ended up with the solution below, but it is not appropriate because I need every piece of text that meets those requirements, not only the first one.

$start_limiter = 'First';
$end_limiter = 'Second';
$haystack = $string;

# Step 1. Find the start limiter's position

$start_pos = strpos($haystack,$start_limiter);
if ($start_pos === FALSE)
{
    die("Starting limiter ".$start_limiter." not found in ".$haystack);
}

# Step 2. Find the ending limiter's position, relative to the start position

$end_pos = strpos($haystack,$end_limiter,$start_pos);

if ($end_pos === FALSE)
{
    die("Ending limiter ".$end_limiter." not found in ".$haystack);
}

# Step 3. Extract the string between the two limiters.
# The text starts right after the start limiter and ends right before
# the end limiter, so skip the start limiter's full length.
$text_start = $start_pos + strlen($start_limiter);
$needle = substr($haystack, $text_start, $end_pos - $text_start);

echo "Found $needle";

I also thought about using explode(), but I think a regex could be better and faster.
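
For comparison, the strpos() approach can also be extended to collect every occurrence without a regex. This is only a minimal sketch: the sample $haystack is made up for illustration, and the loop assumes the limiters do not overlap.

```php
<?php
// Hypothetical sample input, used only for illustration.
$haystack = 'First one Second, First two Second';
$start_limiter = 'First';
$end_limiter = 'Second';

$found = [];
$offset = 0;

// Keep searching from where the previous match ended.
while (($start_pos = strpos($haystack, $start_limiter, $offset)) !== false) {
    // The wanted text begins right after the start limiter.
    $text_start = $start_pos + strlen($start_limiter);

    $end_pos = strpos($haystack, $end_limiter, $text_start);
    if ($end_pos === false) {
        break; // no closing limiter left, so stop
    }

    $found[] = substr($haystack, $text_start, $end_pos - $text_start);

    // Resume the search after the end limiter.
    $offset = $end_pos + strlen($end_limiter);
}

print_r($found);
```

With this sample input, $found holds two entries, ' one ' and ' two ', including the surrounding spaces.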

Recommended answer

I'm not very familiar with PHP, but it seems to me that you can use something like:

if (preg_match("/(?<=First).*?(?=Second)/s", $haystack, $result))
    print_r($result[0]);

(?<=First) looks behind for First but doesn't consume it,

.*? lazily matches everything between First and Second, stopping at the nearest Second,

(?=Second) looks ahead for Second but doesn't consume it,

The s at the end makes the dot . match newlines, if any.
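
To see what the s modifier changes, here is a small sketch. The sample string is hypothetical; its first First...Second pair spans a line break.

```php
<?php
// Hypothetical sample input: the first pair spans two lines.
$haystack = "First alpha\nbeta Second and First gamma Second";

// Without the s modifier the dot stops at "\n", so the first pair cannot
// match and the engine falls through to the second pair.
preg_match('/(?<=First).*?(?=Second)/', $haystack, $noDotAll);

// With the s modifier the dot also matches "\n", so the first pair matches.
preg_match('/(?<=First).*?(?=Second)/s', $haystack, $dotAll);

var_dump($noDotAll[0]); // " gamma "
var_dump($dotAll[0]);   // " alpha\nbeta "
```

In both cases the delimiters themselves are left out of the match, because lookarounds assert their presence without consuming them.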

To get all the text between those delimiters, use preg_match_all and loop over the captured elements:

if (preg_match_all("/(?<=First)(.*?)(?=Second)/s", $haystack, $result)) {
    // $result[1] holds the text captured by group 1 for every match
    foreach ($result[1] as $match) {
        echo $match, PHP_EOL;
    }
}
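
If the delimiters are not fixed words, the same idea can be wrapped in a function. The sketch below is not part of the answer: between_all is a hypothetical name, the scalar type declarations assume PHP 7 or later, and the pattern consumes the delimiters instead of using lookarounds. preg_quote() escapes any regex metacharacters in the delimiters.

```php
<?php
/**
 * Hypothetical helper (not from the answer): returns every substring
 * found between $start and $end in $haystack.
 */
function between_all(string $haystack, string $start, string $end): array
{
    // preg_quote() escapes regex metacharacters in the delimiters;
    // its second argument also escapes the pattern delimiter "/".
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';

    preg_match_all($pattern, $haystack, $matches);

    // $matches[1] holds the text captured by group 1 for each match.
    return $matches[1];
}

$found = between_all('First one Second, First two Second', 'First', 'Second');
print_r(array_map('trim', $found)); // trimmed values: "one" and "two"
```

When nothing matches, the function returns an empty array, so the caller can test the result with count() or empty().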
