Regex optional word match

Problem description

I'm trying to create a regex for extracting singers and lyricists. I was wondering how to make the lyricist search optional.

Example multiline string:

Fireworks Singer: Katy Perry
Vogue Singers: Madonna, Karen Lyricist: Madonna

Regex: /Singers?:(.*)\s?Lyricists?:(.*)/

This matches the second line correctly and extracts Singers (Madonna, Karen) and Lyricists (Madonna).

But it does not work with the first line, where there are no lyricists.

How do I make the Lyricists search optional?
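For reference, the behaviour described above can be reproduced with a short script. This is only a sketch: it assumes Python's re module (the question does not say which language is in use, so the /.../ delimiters are dropped), and the variable names are made up for illustration. It also shows that the captures keep their surrounding whitespace.

```python
import re

# Sample input from the question, one song per line.
lines = [
    "Fireworks Singer: Katy Perry",
    "Vogue Singers: Madonna, Karen Lyricist: Madonna",
]

# The original pattern: the "Lyricist" part is mandatory here.
original = re.compile(r"Singers?:(.*)\s?Lyricists?:(.*)")

# search() returns None when the pattern does not match at all.
results = [original.search(line) for line in lines]

for line, match in zip(lines, results):
    print(repr(line), "->", match.groups() if match else None)
# 'Fireworks Singer: Katy Perry' -> None
# 'Vogue Singers: Madonna, Karen Lyricist: Madonna' -> (' Madonna, Karen ', ' Madonna')
```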

Recommended answer

You can enclose the part you want to make optional in a non-capturing group: (?:...). The group is then treated as a single unit in the regex, and you can put a ? after it to make it optional. Example:

/Singers?:(.*)\s?(?:Lyricists?:(.*))?/

Note that the \s? here is useless, since .* greedily consumes every character up to the end of the line and no backtracking is necessary. For the same reason, the (?:Lyricists?:(.*))? part never captures anything: the optional group simply matches nothing. You can fix this by using the non-greedy version of .*, which is .*?, together with the $ anchor:

/Singers?:(.*?)\s*(?:Lyricists?:(.*))?$/
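To make the difference concrete, here is a small comparison of the greedy and non-greedy patterns, again as a sketch assuming Python's re module; the names greedy and lazy are illustrative only. With the greedy pattern the first group swallows the lyricist text and the second group stays empty, while the non-greedy pattern anchored with $ separates the two.

```python
import re

lines = [
    "Fireworks Singer: Katy Perry",
    "Vogue Singers: Madonna, Karen Lyricist: Madonna",
]

# Greedy: (.*) runs to the end of the line, so the optional group matches nothing.
greedy = re.compile(r"Singers?:(.*)\s?(?:Lyricists?:(.*))?")
# Non-greedy: (.*?) grows only as far as needed for the rest of the pattern to reach $.
lazy = re.compile(r"Singers?:(.*?)\s*(?:Lyricists?:(.*))?$")

greedy_groups = [greedy.search(s).groups() for s in lines]
lazy_groups = [lazy.search(s).groups() for s in lines]

print(greedy_groups)
# [(' Katy Perry', None), (' Madonna, Karen Lyricist: Madonna', None)]
print(lazy_groups)
# [(' Katy Perry', None), (' Madonna, Karen', ' Madonna')]
```

A group that did not participate in the match is reported as None in Python; other regex engines may report an empty string or an undefined value instead.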

Some extra whitespace still ends up in the captures; it can be removed as well, giving the final regex:

/Singers?:\s*(.*?)\s*(?:Lyricists?:\s*(.*))?$/
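As a usage sketch of the final regex, the following applies it to the sample input line by line. It assumes Python's re module, and the helper name extract_credits is hypothetical. Splitting the input into lines first is a deliberate choice in this sketch: \s also matches newlines, so running the pattern over the whole multiline string could let a match reach into the next line.

```python
import re

# Final pattern from the answer, without the /.../ delimiters.
CREDITS = re.compile(r"Singers?:\s*(.*?)\s*(?:Lyricists?:\s*(.*))?$")


def extract_credits(line):
    """Return (singers, lyricists) for one line; lyricists is None when absent.

    Returns None if the line has no "Singer:" / "Singers:" label at all.
    """
    match = CREDITS.search(line)
    return match.groups() if match else None


text = (
    "Fireworks Singer: Katy Perry\n"
    "Vogue Singers: Madonna, Karen Lyricist: Madonna\n"
)

credits = [extract_credits(line) for line in text.splitlines()]
print(credits)
# [('Katy Perry', None), ('Madonna, Karen', 'Madonna')]
```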
