Help needed to verify my understanding of regex coding


Question

I'm a real newbie at regex. I have a banking program and it has regex rules for importing transactions. But sometimes the rules just don't work right. I'm trying to understand just what the rules are saying so that I can make them work the way they should.

I think I've figured out some of it but I just need some verification as to whether I'm on the right track or not.

Here's the actual wording that comes through on my credit card statement:
AT&T*BILL PAYMENT 800-331-0500 TX

Here's the regex rule that I'm evaluating for my banking program:
AT&T\*BILL PAYMENT +[0-9][^\p{L}]*T+[0-9][^\p{L}]*S+SQPZ

Here's how I understand the regex rule (what I'd like to know is whether I'm reading the expression correctly):

AT&T (actual text to display)
\ (escape character, so the next symbol (*) will display)
BILL PAYMENT (actual text to display)
+[0-9] (if there are numbers, display those numbers however many times they exist and in the order they exist; since there aren't any numbers, I THINK it puts in 9 spaces instead)
[^\p{L}] (when the ^ is used it can mean the start of a new line, but inside square brackets it means "not"; so \p{L} used with the ^ indicates an Arabic letter that is not a letter {L} or a number {N})
*T (I think it stands for a Tab)
+[0-9] (as above: if there are numbers, display those numbers however many times they exist and in the order they exist)
[^\p{L}] (as above, it indicates an Arabic letter that is not a letter {L})
*S+SQPZ (I don't know what these stand for. It would seem that they are specific letters that should be displayed at the end of the line.)

So, am I anywhere near right in deciphering the regex?

Thanks for whatever help comes!

Art

Recommended answer

Not quite, because your Regular Expression does not match your example. Can it be fixed?

No! That is because you are missing the whole idea of matching. A regular expression cannot be built from just a single text sample. It describes a set of matching strings, which you have not defined and perhaps do not know.
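The claim that the rule does not match the sample can be checked mechanically. The sketch below is only an illustration in Python, not the banking program's own regex engine (which is not named in the question). Python's built-in `re` module does not support `\p{L}`, so `[\W\d_]` is used here as an assumed stand-in for `[^\p{L}]`; for plain ASCII text like this sample the two behave the same. The string `made_up_match` is invented purely to show what kind of text the rule would accept.

```python
import re

# Assumption: Python's `re` has no \p{L}, so "[\W\d_]" (any character that is
# not a letter) stands in for "[^\p{L}]". This is exact for ASCII input only.
NOT_LETTER = r"[\W\d_]"

# The rule from the question, with the stand-in substituted.
rule = re.compile(
    r"AT&T\*BILL PAYMENT +[0-9]" + NOT_LETTER + r"*T+[0-9]" + NOT_LETTER + r"*S+SQPZ"
)

sample = "AT&T*BILL PAYMENT 800-331-0500 TX"

# Invented text (not from any real statement) that satisfies the rule:
# after the phone number it has a T, a digit, at least one S, then SQPZ.
made_up_match = "AT&T*BILL PAYMENT 800-331-0500 T1SSQPZ"

sample_matches = rule.search(sample) is not None
made_up_matches = rule.search(made_up_match) is not None

print(sample_matches)    # False: after "T" the rule demands a digit, but finds "X"
print(made_up_matches)   # True
```

Read this way, `T`, `S` and `SQPZ` in the rule are literal letters that must appear in the text, and `+` and `*` are repetition counts applied to whatever precedes them; none of them means "display" anything.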

For example, I can suggest one Regex which matches your example: AT&T\*BILL PAYMENT [0-9]{3}-[0-9]{3}-[0-9]{4} TX. But how do you know that it can be only "TX" at the end? What if it should be two arbitrary Latin letters? Then it should be AT&T\*BILL PAYMENT [0-9]{3}-[0-9]{3}-[0-9]{4} [A-Z]{2}. It also matches your example. But how do you know that it should be two? How do you know it should be upper-case? And so on…
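The difference between those two candidate patterns only shows up on text other than the one sample. A minimal Python sketch, assuming a second, hypothetical statement line ending in "CA" (invented for illustration, not taken from the question):

```python
import re

# The two patterns suggested above, copied as written.
exact_tx = re.compile(r"AT&T\*BILL PAYMENT [0-9]{3}-[0-9]{3}-[0-9]{4} TX")
any_two_caps = re.compile(r"AT&T\*BILL PAYMENT [0-9]{3}-[0-9]{3}-[0-9]{4} [A-Z]{2}")

sample = "AT&T*BILL PAYMENT 800-331-0500 TX"
# Hypothetical variant with a different two-letter ending.
variant = "AT&T*BILL PAYMENT 800-331-0500 CA"

# fullmatch requires the whole line to fit the pattern, not just a prefix.
results = {
    ("exact_tx", "sample"): exact_tx.fullmatch(sample) is not None,
    ("any_two_caps", "sample"): any_two_caps.fullmatch(sample) is not None,
    ("exact_tx", "variant"): exact_tx.fullmatch(variant) is not None,
    ("any_two_caps", "variant"): any_two_caps.fullmatch(variant) is not None,
}

for key, matched in results.items():
    print(key, matched)
```

Both patterns accept the sample, but only the second accepts the variant. The single sample cannot tell you which behavior is wanted; that has to come from knowing the format.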

Your "problem" is a typical one that has not really been formulated. You need to know the exact definition of the format, which should cover the exact set of all possible string values. It can be defined mathematically, or… by a regular expression. :-)

—SA

