Parsing variables with commas

Problem description

Hello, I need to parse a variable definition in a string and want to do this with a regex. I am programming in VB, but C# solutions are welcome too.

Some of the possible sources:

DEF BOOL LB_TEST1
DEF BOOL LB_TEST2, LB_TEST3
DEF BOOL LB_TEST4[10], LB_TEST5[10,5]

DEF INT LI_TEST7
DEF INT LI_TEST8[3,4,5], LI_TEST9[3], LI_TEST10

DEF CHAN BOOL CB_TEMP1, CB_TEMP2
...
DEF NCK INT NI_TEMP5
...


As you can see, every line starts with DEF (that's the keyword for a definition).
Then comes an optional scope (nothing/CHAN/NCK).
Then follow the variable names, separated by commas. They can also have an array definition if they are not scalar.
A variable name must start with two characters or "_", then there can be numbers.

What I need is a list of all the variables per line, their datatype and array description:

BOOL LB_TEST1
BOOL LB_TEST2 and LB_TEST3
BOOL LB_TEST4 with 10 and LB_TEST5 with 10,5

INT LI_TEST7
INT LI_TEST8 with 3,4,5 and LI_TEST9 with 3 and LI_TEST10
...



I already have a regex, but it doesn't find the variables after the comma that separates them. There is perhaps also a better way to handle the permutations of DEF BOOL, DEF INT, DEF CHAN BOOL, DEF CHAN INT, DEF NCK BOOL, DEF NCK INT (there are more datatypes, such as string, char, ...):

Here is the part that does not work properly:

\b(def\s+int|def\s+bool|def\s+chan\s+int|def\s+chan\s+bool|def\s+nck\s+int|def\s+nck\s+bool)\s+(?<names>([\w_]{2,}\d*(\[[\d,]+\])?[\,\s]?)+)


If I check this against DEF INT LI_STRING1, LI_STRING2 it matches only LI_STRING1
If I check this against DEF INT LI_STRING1[34,5], LI_STRING2 it matches only LI_STRING1 with 34,5

One essential part is working, but not the whole thing.
What am I doing wrong?
Thanks for any help in advance!

Greetings T_uRRiCA_N

Recommended answer

You've ventured beyond the boundaries of regular expressions here. You should really consider building a small DSL parser that fits your purposes.

  1. ANTLR
  2. ProGrammar
  3. Lex & Yacc
  4. List of Compiler-Compilers



Cheers! Leave me a comment if you still have doubts.

-MRB
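
For a grammar this small, a hand-written parser can stand in for a generated one. The following is only a sketch in Python, not ANTLR or Lex/Yacc output, and the same structure should carry over to VB or C#. The names `tokenize`, `parse_def` and `SCOPES` are made up for illustration, the grammar is inferred from the sample lines in the question, and the "two characters or _" naming rule is not enforced.

```python
import re

# Token kinds for the DEF lines shown in the question (assumed grammar).
TOKEN = re.compile(r"\s*(?:(?P<num>\d+)|(?P<word>[A-Za-z_]\w*)|(?P<sym>[\[\],]))")
SCOPES = {"CHAN", "NCK"}  # scopes named in the question; others may exist


def tokenize(line):
    """Split a line into (kind, text) pairs, rejecting unknown characters."""
    tokens, pos = [], 0
    line = line.rstrip()
    while pos < len(line):
        m = TOKEN.match(line, pos)
        if m is None:
            raise ValueError(f"unexpected character at column {pos}: {line!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


def parse_def(line):
    """Parse one DEF line into (scope, datatype, [(name, dims), ...])."""
    toks = tokenize(line)
    pos = 0

    def peek():
        return toks[pos] if pos < len(toks) else (None, None)

    def take(kind, text=None):
        nonlocal pos
        k, t = peek()
        if k != kind or (text is not None and t.upper() != text):
            raise ValueError(f"expected {text or kind}, got {t!r}")
        pos += 1
        return t

    take("word", "DEF")
    scope = None
    if peek()[0] == "word" and peek()[1].upper() in SCOPES:
        scope = take("word").upper()
    datatype = take("word").upper()

    variables = []
    while True:
        name = take("word")
        dims = []
        if peek() == ("sym", "["):  # optional array definition
            take("sym", "[")
            dims.append(int(take("num")))
            while peek() == ("sym", ","):
                take("sym", ",")
                dims.append(int(take("num")))
            take("sym", "]")
        variables.append((name, dims))
        if peek() != ("sym", ","):  # no further variable on this line
            break
        take("sym", ",")
    if pos != len(toks):
        raise ValueError(f"trailing input: {peek()[1]!r}")
    return scope, datatype, variables
```

Calling `parse_def("DEF CHAN BOOL CB_TEMP1, CB_TEMP2")` gives the scope, the datatype and one `(name, dims)` pair per variable, with an empty `dims` list for scalars.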


Well, I would first "normalize" the string so that you're always working with data delimited the same way:

0) Call string.Replace(", ", "")

1) Traverse the string looking for a right bracket (']') character.

3) If you find one, get the substring comprised of every character up to the first left bracket ('[').

4) Replace all of the "," in the substring with ", "

5) Repeat as necessary

Finally, you can call string.Split(", "); and get all of your string parts in the manner you need them.

Regex is not an appropriate solution.
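
The underlying idea, telling the commas inside brackets apart from the commas between variables, can also be sketched without the Replace and Split calls by tracking bracket depth while walking the string. This is a Python sketch under that assumption rather than a literal transcription of the steps above; `split_top_level` and `describe` are hypothetical names, and the input is the part of the line after the datatype.

```python
def split_top_level(text):
    """Split on commas that are not inside [...] and trim each part."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "," and depth == 0:
            # A comma outside brackets separates two variables.
            parts.append(text[start:i].strip())
            start = i + 1
    parts.append(text[start:].strip())
    return parts


def describe(declarator):
    """Turn 'NAME[3,4]' into ('NAME', '3,4') and 'NAME' into ('NAME', '')."""
    name, _, rest = declarator.partition("[")
    return name.strip(), rest.rstrip("]").replace(" ", "")
```

Because the split ignores spacing around the separators, `"A[1,2],B"` and `"A[1,2] , B"` yield the same parts, so no separate normalization pass is needed in this variant.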

