Regular expression matching any subset of a given set?

Question

Is it possible to write a regular expression that matches any subset of a given set of characters a1 ... an?
That is, it should match any string in which each of these characters appears at most once, no other characters appear, and the relative order of the characters doesn't matter.

Some approaches that come to mind at once:
1. [a1...an]* or (a1|a2|...|an)* - this allows a character to appear more than once.
2. (a1?a2?...an?) - no repeated characters, but the relative order is fixed, so this matches any subsequence but not any subset.
3. ($|a1|...|an|a1a2|a2a1|...|a1...an|...|an...a1), i.e. write out every ordering of every subset (just hardcode all matching strings :)). Of course, this is not acceptable.
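To make approach 3 concrete, here is a minimal Python sketch that generates the hardcoded alternation mechanically. The function name `enumerate_pattern` is made up for illustration, and the sketch assumes the characters in the set are distinct.

```python
import re
from itertools import permutations


def enumerate_pattern(chars):
    """Return (words, compiled regex) listing every ordering of every subset.

    `chars` is assumed to contain distinct single characters.
    """
    words = [
        "".join(re.escape(c) for c in p)
        for k in range(len(chars) + 1)      # subset sizes 0..n
        for p in permutations(chars, k)     # every ordering of each subset
    ]
    # The first alternative is the empty string; fullmatch() backtracks
    # through the alternatives until one covers the whole input.
    return words, re.compile("(?:" + "|".join(words) + ")")


words, pattern = enumerate_pattern("abc")
print(len(words))                      # number of hardcoded alternatives
print(bool(pattern.fullmatch("cab")))  # each character at most once
print(bool(pattern.fullmatch("aab")))  # 'a' repeated
```

The set of matching strings is finite, so such a pattern always exists, but the number of alternatives grows roughly like n! (three characters already give 16), which is why hardcoding is impractical.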

I also suspect that it may be theoretically impossible, because while scanning the string we would need to remember which characters we have already seen, and as far as I know regular expressions can recognize only regular (right-linear) languages.

Any help will be appreciated. Thanks in advance.

Recommended answer

I can't think how to do it with a single regex, but this is one way to do it with n regexes (I will use 1 2 ... m n etc. for your a's):

^[23..n]*1?[23..n]*$
^[13..n]*2?[13..n]*$
...
^[12..m]*n?[12..m]*$

If all of the above match, your string is a subset of 12..mn in the required sense: each character appears at most once and no other characters appear.

How this works: each line requires the string to consist exactly of:

  • any number of characters drawn from the set, except a particular one
  • perhaps the particular one
  • any number of characters drawn from the set, except the particular one

If this passes when every element in turn is taken as the particular one, we know:

  • there is nothing in the string other than the allowed elements
  • there is at most one of each allowed element

as required.
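The n-regex check described above could be sketched in Python roughly as follows. The names `subset_patterns` and `is_subset` are hypothetical, the characters are assumed to be distinct, and `fullmatch` stands in for the `^...$` anchors.

```python
import re


def subset_patterns(chars):
    """Build one pattern per character c: [others]* c? [others]*."""
    patterns = []
    for c in chars:
        others = "".join(re.escape(o) for o in chars if o != c)
        if others:
            body = f"[{others}]*{re.escape(c)}?[{others}]*"
        else:
            # Single-character set: an empty class "[]" is not valid regex.
            body = f"{re.escape(c)}?"
        patterns.append(re.compile(body))
    return patterns


def is_subset(s, chars):
    """True if every per-character pattern matches the whole string."""
    return all(p.fullmatch(s) for p in subset_patterns(chars))


print(is_subset("cab", "abc"))  # every character at most once
print(is_subset("aab", "abc"))  # fails the pattern where 'a' is the particular one
print(is_subset("abd", "abc"))  # 'd' is outside every character class
```

Each pattern is linear in n, so the total pattern text grows quadratically rather than factorially.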

For completeness, I should say that I would only do this if I were under orders to "use regex"; if not, I'd track which allowed elements have been seen and iterate over the characters of the string, doing the obvious thing.
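A minimal sketch of that non-regex alternative, assuming Python and a hypothetical function name `is_subset_string`:

```python
def is_subset_string(s, allowed):
    """True if s uses only characters from `allowed`, each at most once."""
    allowed = set(allowed)
    seen = set()  # allowed characters encountered so far
    for ch in s:
        if ch not in allowed or ch in seen:
            return False  # foreign character, or a repeat
        seen.add(ch)
    return True


print(is_subset_string("cab", "abc"))
print(is_subset_string("aab", "abc"))
```

This runs in a single pass over the string and needs no pattern construction at all.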
