How to get substring from string using Regex?
Problem description
I have a large string and want to get some substrings from that.
Suppose the string is:
string inputStr="[string1](string2).[string3](string4).[string5](string6).";
The desired substrings are:
result[0]="[string1](string2).";
result[1]="[string3](string4).";
result[2]="[string5](string6).";
Using Regex.Split, the ").[" delimiter is removed from the results:
string[] result = Regex.Split(inputStr,@"\)\.\[");
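To make the problem concrete, here is a minimal sketch of what that Split call returns for the sample input. The names SplitDemo and SplitOnDelimiter are only illustrative and do not come from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: runs the Split call from the question.
    public static string[] SplitOnDelimiter(string input)
    {
        // The pattern matches the three literal characters ").[" and
        // Regex.Split discards whatever the pattern matches.
        return Regex.Split(input, @"\)\.\[");
    }

    public static void Main()
    {
        string inputStr = "[string1](string2).[string3](string4).[string5](string6).";
        foreach (string part in SplitOnDelimiter(inputStr))
        {
            Console.WriteLine(part);
        }
        // Prints:
        // [string1](string2
        // string3](string4
        // string5](string6).
    }
}
```

The closing ")." and the opening "[" are lost at every split point, which is why the pieces do not match the desired result.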
Edit: [MTHeffron] Added here from comment on Solution 1
string1, string2, string3, ... can contain any character, including punctuation marks such as .,!?[]()
You can do it using "Balancing Groups":
(?<Phrase1>(?<sb>\[).*?(?<-sb>\]))(?<Phrase2>(?<rb>\().*?(?<-rb>\)\.))
[edit]
NOTE: the '[', ']', '(', ')', and '.' should all be prefixed by a backslash, except for a '.' that is followed by a '*' (there it is the match-anything wildcard)
[/edit]
Which will capture your input into groups of separated phrases:
Phrase1 = "[string1]", Phrase2 = "(string2)."
Phrase1 = "[string3]", Phrase2 = "(string4)."
Phrase1 = "[string5]", Phrase2 = "(string6)."
From there it's a trivial matter to remove the outside bits with a simple substring, since they will always be fixed length.
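As a rough sketch of how the capture and the substring step might fit together, the following runs the pattern with Regex.Matches and trims the fixed-length wrappers. The class PhraseDemo and the helper ExtractPairs are hypothetical names used for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PhraseDemo
{
    // The pattern from the answer, unchanged.
    private const string Pattern =
        @"(?<Phrase1>(?<sb>\[).*?(?<-sb>\]))(?<Phrase2>(?<rb>\().*?(?<-rb>\)\.))";

    // Hypothetical helper: returns { inner1, inner2 } for every match,
    // with the surrounding "[", "]", "(" and ")." removed.
    public static List<string[]> ExtractPairs(string input)
    {
        var pairs = new List<string[]>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            string p1 = m.Groups["Phrase1"].Value; // e.g. "[string1]"
            string p2 = m.Groups["Phrase2"].Value; // e.g. "(string2)."
            // Phrase1: drop one leading and one trailing character.
            string inner1 = p1.Substring(1, p1.Length - 2);
            // Phrase2: drop one leading and two trailing characters.
            string inner2 = p2.Substring(1, p2.Length - 3);
            pairs.Add(new[] { inner1, inner2 });
        }
        return pairs;
    }

    public static void Main()
    {
        string inputStr = "[string1](string2).[string3](string4).[string5](string6).";
        foreach (string[] pair in ExtractPairs(inputStr))
        {
            Console.WriteLine(pair[0] + " / " + pair[1]);
        }
        // Prints:
        // string1 / string2
        // string3 / string4
        // string5 / string6
    }
}
```

If the original result[] strings are wanted instead, m.Value holds the whole "[...](...)." match for each item.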
And yes, it'll cope with '[', ']', '(', ')', and '.' inside your strings (unless the second phrase also contains ")." in which case it may fail).
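The two situations can be checked with a small experiment. This is only a sketch: the sample inputs and the names EdgeCaseDemo and Describe are made up for illustration, and the pattern is the one given above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EdgeCaseDemo
{
    // The pattern from the answer, unchanged.
    private const string Pattern =
        @"(?<Phrase1>(?<sb>\[).*?(?<-sb>\]))(?<Phrase2>(?<rb>\().*?(?<-rb>\)\.))";

    // Hypothetical helper: formats each match as "Phrase1 | Phrase2".
    public static List<string> Describe(string input)
    {
        var lines = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            lines.Add(m.Groups["Phrase1"].Value + " | " + m.Groups["Phrase2"].Value);
        }
        return lines;
    }

    public static void Main()
    {
        // Punctuation inside the strings is handled.
        foreach (string line in Describe("[a.b](c[d]!?).[e](f)."))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // [a.b] | (c[d]!?).
        // [e] | (f).

        // The second phrase contains ")." so it is cut short,
        // and the leftover "c)." is silently skipped.
        foreach (string line in Describe("[a](b).c).[d](e)."))
        {
            Console.WriteLine(line);
        }
        // Prints:
        // [a] | (b).
        // [d] | (e).
    }
}
```

In the failing case the number of matches still looks plausible, so it may be worth validating the captured text rather than just counting matches.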
Try it - it could be what you want!
If you are going to play with complex regexes, then get a copy of Expresso (http://www.ultrapico.com/Expresso.htm) - it's free, and it examines and generates regular expressions.