Extracting tokens from a string with regular expressions in .NET
Question
I'm curious if this is even possible with Regex. I want to extract tokens from a string similar to:
Select a [COLOR] and a [SIZE].
OK, easy enough - I can use (\[[A-Z]+\])
However, I want to also extract the text between the tokens. Basically, I want the matched groups for the above to be:
"Select a "
"[COLOR]"
" and a "
"[SIZE]"
"."
What's the best approach for this? If there's a way to do this with RegEx, that would be great. Otherwise, I'm guessing I have to extract the tokens, then manually loop through the MatchCollection and parse out the substrings based on the indexes and lengths of each Match. Please note I need to preserve the order of the strings and tokens. Is there a better algorithm to do this sort of string parsing?
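For comparison, the manual approach described above (loop over the matches and slice out the text between them using each match's Index and Length) might look roughly like the sketch below. The class and method names (ManualTokenizer, Tokenize) are made up for illustration, and the pattern assumes tokens are uppercase letters in square brackets, as in the example.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ManualTokenizer
{
    // Hypothetical helper: returns literal text and [TOKEN]s in their original order.
    public static List<string> Tokenize(string input)
    {
        var parts = new List<string>();
        int position = 0; // index just past the previous match

        foreach (Match match in Regex.Matches(input, @"\[[A-Z]+\]"))
        {
            // Literal text between the previous token and this one, if any.
            if (match.Index > position)
                parts.Add(input.Substring(position, match.Index - position));

            parts.Add(match.Value);
            position = match.Index + match.Length;
        }

        // Literal text after the last token, if any.
        if (position < input.Length)
            parts.Add(input.Substring(position));

        return parts;
    }
}
```

This keeps the order of strings and tokens, but it is bookkeeping that has to be written and tested by hand.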
Recommended answer
Use Regex.Split(s, @"(\[[A-Z]+\])"). It should give you exactly the array you're after: when the pattern contains a capturing group, Split includes the captured text as elements of the result array, in their original order.
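A minimal sketch of that call wrapped in a helper, in case it is useful. The names TokenSplitter and Tokenize are hypothetical. The Where filter is an addition, not part of the answer: Regex.Split puts an empty string at the start or end of the array when the input begins or ends with a match, and this sketch assumes you don't want those.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenSplitter
{
    // Hypothetical helper: splits on [TOKEN]s but keeps them, because the
    // pattern wraps the token in a capturing group.
    public static string[] Tokenize(string input)
    {
        return Regex.Split(input, @"(\[[A-Z]+\])")
                    .Where(part => part.Length > 0) // drop empty edge entries (assumption)
                    .ToArray();
    }
}
```

Calling TokenSplitter.Tokenize("Select a [COLOR] and a [SIZE].") should return the five strings listed in the question, in the same order.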