Tokenized String in C#


Problem description

Dear all,

I have a string like the one below:

string x = "SLA - 030', '<description>' (IfcSlab)";

I need to tokenize that string into 3 parts:
1st part - SLA - 030
2nd part - <description>
3rd part - IfcSlab
and store those parts in 3 variables. Please help me do this.

Thanks,
Ruwan Atapattu

Solution

Assuming the following grammar:

line   = record "," tag arg
record = any-char-except-tick+ "'"
tag    = "'" any-char-except-tick+ "'"
arg    = "(" any-char-except-lparen* ")"



Grammar part              Regex        Comment
full line                 ^...$        full line from begin (^) to end ($)
separating white spaces   \s*          zero or more spaces, tabs, etc.
comma                     ,            one comma
record                    (.+?)'       at least one char (.+), as few as possible (?), in a capturing group ((...)), directly followed by a tick (')
tag                       '(.+?)'      some text enclosed in ticks ('...'); the enclosed text is captured in a group ((...)) and must consist of at least one character (.+), matching as few as possible (?)
arg                       \((.*?)\)    some text enclosed in parentheses (\(...\)); the enclosed text is captured in a group ((...)) and may consist of zero or more characters (.*), matching as few as possible (?)
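The rows of the table can be put together into a single pattern. The following is only a sketch of one way to write it: it uses RegexOptions.IgnorePatternWhitespace so that each piece can carry its comment, and named groups instead of numbered ones. The class name CommentedPatternDemo and the group names record, tag and arg are assumptions made for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the name is an assumption for illustration.
public static class CommentedPatternDemo
{
    // With IgnorePatternWhitespace, unescaped white space in the pattern is
    // ignored and everything from # to the end of the line is a comment.
    public static readonly Regex Line = new Regex(@"
        ^ \s*               # start of line, optional leading white space
        (?<record>.+?) '    # record: one or more chars (lazy), then a tick
        \s* , \s*           # one comma with optional white space around it
        ' (?<tag>.+?) '     # tag: one or more chars enclosed in ticks
        \s*                 # optional white space before the argument
        \( (?<arg>.*?) \)   # arg: zero or more chars enclosed in parentheses
        \s* $               # optional trailing white space, end of line
        ", RegexOptions.IgnorePatternWhitespace);

    public static void Main()
    {
        Match m = Line.Match("SLA - 030', '<description>' (IfcSlab)");
        Console.WriteLine(m.Success);                 // True
        Console.WriteLine(m.Groups["record"].Value);  // SLA - 030
        Console.WriteLine(m.Groups["tag"].Value);     // <description>
        Console.WriteLine(m.Groups["arg"].Value);     // IfcSlab
    }
}
```

Named groups are optional here; they only make the later lookups easier to read than Groups[1], Groups[2] and Groups[3].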


It's easier to write the regex pattern as a verbatim string literal (@"..."), since the backslash has no special meaning in a verbatim string literal and therefore does not need to be doubled.

var match = Regex.Match(x, @"^\s*(.+?)'\s*,\s*'(.+?)'\s*\((.*?)\)\s*$");
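To get from the match to the three variables the question asks for, the capture groups can be read out by index. The sketch below assumes the pattern ends with the $ anchor described in the table. The class name SlabLineTokenizer and the method TryTokenize are hypothetical helpers introduced here, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; names are assumptions for illustration.
public static class SlabLineTokenizer
{
    // Same pattern as above, assuming it is closed with the $ anchor.
    private static readonly Regex LinePattern = new Regex(
        @"^\s*(.+?)'\s*,\s*'(.+?)'\s*\((.*?)\)\s*$");

    // Returns false (and empty strings) when the line does not fit the grammar.
    public static bool TryTokenize(string line, out string record,
                                   out string tag, out string arg)
    {
        record = tag = arg = string.Empty;
        Match match = LinePattern.Match(line);
        if (!match.Success)
        {
            return false;
        }
        record = match.Groups[1].Value; // text before the first tick
        tag = match.Groups[2].Value;    // text between the second pair of ticks
        arg = match.Groups[3].Value;    // text inside the parentheses
        return true;
    }

    public static void Main()
    {
        string x = "SLA - 030', '<description>' (IfcSlab)";
        if (TryTokenize(x, out string part1, out string part2, out string part3))
        {
            Console.WriteLine(part1); // SLA - 030
            Console.WriteLine(part2); // <description>
            Console.WriteLine(part3); // IfcSlab
        }
        else
        {
            Console.WriteLine("Input does not match the assumed grammar.");
        }
    }
}
```

Checking match.Success before reading the groups matters: for a line that does not fit the grammar, the groups would otherwise just yield empty strings without any indication that parsing failed.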

