Regex between two nth position characters


Problem description

I'm trying to fetch some data from a text string. The value I need lies between two delimiter characters (_), but it could be the word in any nth position.

Currently I have the following

!((?:.*?(_)){2})_(.+?)$

to process the following data

D20_Mbps_U10_Mbps_TC4_P

where I expect to get

U10

but I get nothing, because the first part captures

D20_Mbps_

and thus leaves nothing for the second part to capture.

I have also tried

_\s*(.*?)(?=\s*_)

But this only gives me the first occurrence, whereas I need the match at the nth position, with n supplied at runtime.

Any ideas?

Thanks

Recommended answer

Let me try answering this in detail.

When you want to match the Nth occurrence of a substring within a delimited string, you should really consider a String.Split function first. In your case, splitting on _ and picking the value you need is a trivial task.
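As a rough illustration of the split approach, here is a minimal Python sketch. The helper name nth_field and its 1-based n parameter are made up for this example and are not part of the original question.

```python
def nth_field(text: str, n: int, sep: str = "_") -> str:
    """Return the n-th (1-based) sep-delimited field of text."""
    parts = text.split(sep)
    if not 1 <= n <= len(parts):
        raise IndexError(f"field {n} requested, but text has {len(parts)} fields")
    # Fields are counted from 1, list indices from 0.
    return parts[n - 1]


print(nth_field("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

Since n is an ordinary argument here, supplying it at runtime needs no pattern construction at all.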

Now, when you cannot use programming means to extract that value, you can only do this with a limiting quantifier, grouping and capturing (in Java and .NET, it is possible to achieve the same even without capturing).

So, the main idea is to match 0 or more characters other than your delimiter followed by the delimiter itself, and to repeat that group N-1 times. The last repetition already consumes the delimiter in front of the Nth element, so you then just capture the following non-delimiter characters. For the 3rd element, N-1 is 2:

^(?:[^_]*_){2}([^_]*)

Group 1 will contain U10.
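Because the question asks for n to be supplied at runtime, the quantifier can be filled in when the pattern is built. The following is only a sketch in Python; the function name nth_field_regex is hypothetical, and it assumes a single-character delimiter, since the pattern relies on a negated character class.

```python
import re


def nth_field_regex(text: str, n: int, sep: str = "_"):
    """Return the n-th (1-based) field via the regex above, or None if absent."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if len(sep) != 1:
        raise ValueError("sep must be a single character")
    d = re.escape(sep)
    # Skip n-1 "field + delimiter" pairs, then capture the next field.
    # The doubled braces emit a literal {n-1} quantifier into the pattern.
    pattern = rf"^(?:[^{d}]*{d}){{{n - 1}}}([^{d}]*)"
    match = re.match(pattern, text)
    return match.group(1) if match else None


print(nth_field_regex("D20_Mbps_U10_Mbps_TC4_P", 3))  # U10
```

With n = 3 the generated pattern is exactly ^(?:[^_]*_){2}([^_]*). When the string has fewer than n fields, the pattern cannot match and the function returns None.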

Another variation:

^(?:[^_]*_){2}([^_]*)_(.+)$

This will capture the 3rd _-delimited element into Group 1. Group 2 in this case holds the 4th and any later elements, i.e. the rest of the string up to the end.
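To see both groups of this variation at work, here is a small Python sketch. The names PATTERN and split_third are chosen only for this illustration.

```python
import re

# Same pattern as the variation above: the 3rd field, then the rest of the string.
PATTERN = re.compile(r"^(?:[^_]*_){2}([^_]*)_(.+)$")


def split_third(text: str):
    """Return (third_field, rest_of_string), or None when the pattern does not match."""
    match = PATTERN.match(text)
    return match.groups() if match else None


print(split_third("D20_Mbps_U10_Mbps_TC4_P"))  # ('U10', 'Mbps_TC4_P')
```

Note that this variation requires at least one more element after the 3rd one, so a string such as A_B_C does not match at all.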

Note that in some regex flavors { and ( must be escaped to act as a quantifier and a group (Vim, sed without extended-regex mode, etc.).
