Extracting tokens from a string with regular expressions in .NET

Question

I'm curious if this is even possible with Regex. I want to extract tokens from a string similar to:

Select a [COLOR] and a [SIZE].

OK, easy enough - I can use (\[[A-Z]+\])

However, I want to also extract the text between the tokens. Basically, I want the matched groups for the above to be:

"Select a "
"[COLOR]"
" and a "
"[SIZE]"
"."

What's the best approach for this? If there's a way to do this with RegEx, that would be great. Otherwise, I'm guessing I have to extract the tokens, then manually loop through the MatchCollection and parse out the substrings based on the indexes and lengths of each Match. Please note I need to preserve the order of the strings and tokens. Is there a better algorithm to do this sort of string parsing?
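For reference, the manual fallback described above (walk the MatchCollection and cut substrings by each Match's Index and Length) might look roughly like the sketch below. The names ManualTokenizer and Tokenize are made up for illustration, and the pattern assumes tokens are upper-case letters in square brackets, as in the example string.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ManualTokenizer
{
    // Hypothetical helper: returns literal text and [TOKEN] pieces in their original order.
    public static List<string> Tokenize(string s)
    {
        var parts = new List<string>();
        int pos = 0; // index just past the previous match

        foreach (Match m in Regex.Matches(s, @"\[[A-Z]+\]"))
        {
            // Text between the previous match (or the start) and this match.
            if (m.Index > pos)
            {
                parts.Add(s.Substring(pos, m.Index - pos));
            }

            parts.Add(m.Value);          // the token itself, e.g. "[COLOR]"
            pos = m.Index + m.Length;    // continue after this token
        }

        // Trailing text after the last token, if any.
        if (pos < s.Length)
        {
            parts.Add(s.Substring(pos));
        }

        return parts;
    }

    public static void Main()
    {
        foreach (string part in Tokenize("Select a [COLOR] and a [SIZE]."))
        {
            Console.WriteLine("\"" + part + "\"");
        }
    }
}
```

This version never emits empty strings, because text is only added when there is something between two matches.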

Recommended answer

Use Regex.Split(s, @"(\[[A-Z]+\])") - it should give you the exact array you're after. Split takes captured groups and converts them to tokens in the result array.
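A minimal sketch of that call, using the example string from the question. The class and method names (TokenSplitDemo, SplitTokens) are only for illustration; the pattern is the one given above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TokenSplitDemo
{
    // Splits on bracketed upper-case tokens. Because the pattern is wrapped
    // in a capturing group, the matched tokens are kept in the result array.
    public static string[] SplitTokens(string s)
    {
        return Regex.Split(s, @"(\[[A-Z]+\])");
    }

    public static void Main()
    {
        foreach (string part in SplitTokens("Select a [COLOR] and a [SIZE]."))
        {
            Console.WriteLine("\"" + part + "\"");
        }
    }
}
```

One thing to watch for: if the input starts or ends with a token, Regex.Split includes an empty string at that end of the array, so you may want to filter out empty entries.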
