PHP regex for a word collection around a search phrase

Question

Hi, I am trying to create a regex that will do the following:

Grab 5 words before the search phrase (or x if there are only x words there) and 5 words after the search phrase (or x if there are only x words there) from a block of text. By "words" I mean words or numbers, whatever is in the block of text.

For example:

Welcome to Stack Overflow! Visit your user page to set your name and email.

If you were to search "visit" it would return: Welcome to Stack Overflow! Visit your user page to set

The idea is to use preg_match_all in PHP to give me a bunch of search results showing where in the text the search phrase appears, one for each occurrence of the search phrase.

Thanks in advance :D

On a side note, there may be a better way to get to my result. If you feel there is, please feel free to throw it in the pool, as I'm not sure this is the best way to do what I need, just the first one I thought of :D

Recommended answer

How about this:

(\S+\s+){0,5}\S*\bvisit\b\S*(\s+\S+){0,5}

This will match five "words" before and after your search word (in this case visit), accepting fewer if the text is shorter.

preg_match_all(
    '/(\S+\s+){0,5} # Match five (or less) "words"
    \S*             # Match (if present) punctuation before the search term
    \b              # Assert position at the start of a word
    visit           # Match the search term
    \b              # Assert position at the end of a word
    \S*             # Match (if present) punctuation after the search term
    (\s+\S+){0,5}   # Match five (or less) "words"
    /ix', 
    $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0]; // keep only the full matched snippets
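
The pattern above hard-codes the search term. As a minimal sketch of how it might be wrapped for an arbitrary term, the helper below builds the same pattern at run time. The function name `words_around` and its parameters are hypothetical and not part of the original answer. It also assumes non-capturing groups and drops the `x` flag so that spaces inside a multi-word phrase stay literal.

```php
<?php
// Hypothetical helper (not from the original answer): returns each match of
// $phrase in $text together with up to $n whitespace-separated "words" on
// either side.
function words_around(string $text, string $phrase, int $n = 5): array
{
    // preg_quote() escapes regex metacharacters in the phrase;
    // '/' is passed because it is the pattern delimiter.
    $quoted = preg_quote($phrase, '/');

    $pattern = '/(?:\S+\s+){0,' . $n . '}'  // up to $n "words" before
             . '\S*\b' . $quoted . '\b\S*'   // the term, plus attached punctuation
             . '(?:\s+\S+){0,' . $n . '}/i'; // up to $n "words" after

    preg_match_all($pattern, $text, $matches);
    return $matches[0]; // full matches only
}

$text = 'Welcome to Stack Overflow! Visit your user page to set your name and email.';
print_r(words_around($text, 'visit'));
```

Note that preg_match_all returns non-overlapping matches, so when two occurrences of the term lie within a few words of each other, the later one may be swallowed by the context of the earlier one instead of getting its own result.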

I'm defining a "word" as a sequence of non-whitespace characters, separated by at least one whitespace character.

The search words should be actual words (starting and ending with an alphanumeric character), since \b only matches at a position that has a word character on one side.
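
To illustrate why the search term should start and end with an alphanumeric character, here is a small sketch using only the \b assertions from the pattern. The helper name `found_with_boundaries` is made up for this example.

```php
<?php
// Hypothetical helper for illustration: true if $term is found when wrapped
// in the same \b word-boundary assertions the answer's pattern uses.
function found_with_boundaries(string $text, string $term): bool
{
    $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
    return preg_match($pattern, $text) === 1;
}

$text = 'I like C++ a lot';
var_dump(found_with_boundaries($text, 'like')); // bool(true)
var_dump(found_with_boundaries($text, 'C++'));  // bool(false)
```

The second call fails because the position between "+" and the following space has no word character on either side, so the trailing \b cannot match there.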
