RegEx to validate a comma separated list of options

Problem description

I'm using PHP's filter functions (FILTER_VALIDATE_REGEXP specifically) to validate input data. I have a list of options, and the $input variable can specify any number of options from that list.

The options are (case-insensitive):

  1. all
  2. rewards
  3. join
  4. promotions
  5. stream
  6. checkin
  7. verified_checkin

The $input variable can have almost any combination of the values. The possible success cases are:

  • all (the value can be either all or a comma separated list of other values, but not both)
  • rewards,stream,join (a comma separated list of values excluding all)
  • join (a single value)

The regular expression I've been able to come up with is:

/^(?:all|(?:checkin|verified_checkin|rewards|join|promotions|stream)?(?:,(?:checkin|verified_checkin|rewards|join|promotion|stream))*)$/

So far, it works for the following example scenarios:

  • all (passes)
  • rewards,join,promotion,checkin,verified_checkin (passes)
  • join (passes)

However, it lets values with a leading comma, and values with duplicates, through:

  • ,promotion,checkin,verified_checkin (starts with a comma, so it should not pass, but it does)

Also, checking for duplicates would be a bonus, but it is not strictly required.

  • rewards,join,promotion,checkin,join,verified_checkin (contains a duplicate value but still passes; less important than the leading comma)

I've been at it for a couple of days now and, having tried various implementations, this is the closest I've been able to get.

Any ideas on how to handle the leading comma false positive?

UPDATE: Edited the question to better explain that duplicate filtering isn't really a requirement, just a bonus.

Recommended answer

Sometimes regular expressions just make things more complicated than they should be. Regular expressions are really good at matching patterns, but when you introduce external rules that depend on the number of matched patterns, things get complicated fast.

In this case I would just split the list on commas and check the resulting strings against the rules you just described.

// validate_options is an illustrative name; $input_string is the string to match
function validate_options($input_string)
{
    $valid_choices = array('checkin','join','promotions','rewards','stream','verified_checkin');

    // the options are case-insensitive, so normalize before comparing
    $tokens = explode(',', strtolower($input_string));

    sort($tokens);                       // sort the tokens to make it easy to find duplicates

    if($tokens[0] === 'all')
        return count($tokens) === 1;     // 'all' is valid only on its own (fails for all + other options)

    if(!in_array($tokens[0], $valid_choices))
        return FALSE;                    // fail (invalid first choice, including an empty one from a stray comma)

    for($i = 1; $i < count($tokens); $i++)
    {
        if($tokens[$i] === $tokens[$i-1])
           return FALSE;                 // fail (duplicates)

        if(!in_array($tokens[$i], $valid_choices))
           return FALSE;                 // fail (choice not valid)
    }

    return TRUE;                         // every token is a valid, unique choice
}

Edit

Since you edited your question and specified that duplicates would be acceptable, but you definitely want a regex-based solution, this one should do:

^(all|((checkin|verified_checkin|rewards|join|promotions|stream)(,(checkin|verified_checkin|rewards|join|promotions|stream))*))$

It will not fail on duplicates, but it will take care of leading or trailing commas and of the all + other choices combination.
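As a minimal sketch of how such a pattern might be plugged into the filter functions mentioned in the question: the helper name `validate_options_regex` is hypothetical, the `i` modifier is added here to cover the case-insensitivity requirement, and `promotions` is assumed to be the intended spelling of that option in both groups.

```php
<?php
// Hypothetical helper: returns true when $input is either "all" or a
// comma separated list of the other options (duplicates are allowed).
function validate_options_regex(string $input): bool
{
    $option  = '(?:checkin|verified_checkin|rewards|join|promotions|stream)';
    $pattern = '/^(?:all|' . $option . '(?:,' . $option . ')*)$/i';

    // filter_var() returns the input string on success and false on failure.
    $result = filter_var($input, FILTER_VALIDATE_REGEXP, [
        'options' => ['regexp' => $pattern],
    ]);

    return $result !== false;
}

var_dump(validate_options_regex('all'));                  // bool(true)
var_dump(validate_options_regex('rewards,stream,join'));  // bool(true)
var_dump(validate_options_regex(',promotions,checkin'));  // bool(false)
var_dump(validate_options_regex('all,join'));             // bool(false)
```

Because the first option is no longer optional, an input cannot start with a comma, which is the false positive the question asks about.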

Filtering out duplicates with a regex would be pretty difficult, but maybe not impossible (if you use a look-ahead with a capture group placeholder).
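One possible reading of that look-ahead idea, as a hedged sketch rather than a definitive pattern: a single negative look-ahead at the start captures any complete token and rejects the input if the same token appears again later. The helper name `validate_unique_options` is hypothetical, and the `i` modifier is an assumption made to honour the case-insensitivity requirement.

```php
<?php
// Hypothetical helper: same rules as the simpler pattern, but an input
// that repeats an option is rejected.
function validate_unique_options(string $input): bool
{
    $option = '(?:checkin|verified_checkin|rewards|join|promotions|stream)';

    // Negative look-ahead: fail if some complete token ([^,]+) is followed,
    // possibly after other tokens, by a comma and the same token again (\1).
    $noDuplicates = '(?!.*(?:^|,)([^,]+)(?:,[^,]+)*,\1(?:,|$))';

    $pattern = '/^' . $noDuplicates
             . '(?:all|' . $option . '(?:,' . $option . ')*)$/i';

    return preg_match($pattern, $input) === 1;
}

var_dump(validate_unique_options('checkin,verified_checkin')); // bool(true)
var_dump(validate_unique_options('rewards,join,promotions,checkin,join,verified_checkin')); // bool(false)
```

The token is anchored on both sides (start or comma before, comma or end after), so `checkin` is not mistaken for a repeat of `verified_checkin`.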

Second edit

Although you mentioned that detecting duplicate entries is not critical, I figured I'd try my hand at crafting a pattern that would also check for duplicate entries.

As you can see below, it's not very elegant, nor is it easily scalable, but it does get the job done for the finite list of options you have, using negative look-aheads.

^(all|(checkin|verified_checkin|rewards|join|promotions|stream)(,(?!\2)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(?!\6)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(?!\6)(?!\8)(checkin|verified_checkin|rewards|join|promotions|stream))?(,(?!\2)(?!\4)(?!\6)(?!\8)(?!\10)(checkin|verified_checkin|rewards|join|promotions|stream))?)$

Since the final regex is so long, I'm going to break it up into parts to make the general idea easier to follow:

^(all|
  (checkin|verified_checkin|rewards|join|promotions|stream)
  (,(?!\2)(checkin|verified_checkin|rewards|join|promotions|stream))?
  (,(?!\2)(?!\4)(checkin|verified_checkin|rewards|join|promotions|stream))?
  (,(?!\2)(?!\4)(?!\6)(checkin|verified_checkin|rewards|join|promotions|stream))?
  (,(?!\2)(?!\4)(?!\6)(?!\8)(checkin|verified_checkin|rewards|join|promotions|stream))?
  (,(?!\2)(?!\4)(?!\6)(?!\8)(?!\10)(checkin|verified_checkin|rewards|join|promotions|stream))?
 )$

You can see that the mechanism used to form the pattern is somewhat iterative. Such a pattern could be generated automatically by an algorithm if you wanted to provide a different list, but the resulting pattern would get rather large, rather quickly.
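The generation step described above could be sketched roughly as follows. The function name `build_unique_list_pattern` is hypothetical, and the sketch deviates from the hand-written pattern in three assumed ways: it uses PCRE's `\g{N}` back-reference syntax to avoid ambiguity with multi-digit group numbers, it adds `(?:,|$)` inside each look-ahead so that one option being a prefix of another does not count as a duplicate, and it adds the `/` delimiters and `i` modifier.

```php
<?php
// Hypothetical generator for the pattern shown above. $options is the
// list of allowed values other than "all".
function build_unique_list_pattern(array $options): string
{
    // Escape each option so it is matched literally inside the regex.
    $quoted = array_map(function ($option) {
        return preg_quote($option, '/');
    }, $options);
    $alternatives = '(' . implode('|', $quoted) . ')';

    // Group 1 wraps the whole alternation, group 2 captures the first token.
    $pattern = '^(all|' . $alternatives;

    // The k-th additional token is captured by group 2k + 2. Before matching
    // it, rule out every earlier token with a negative look-ahead.
    for ($k = 1; $k < count($options); $k++) {
        $lookaheads = '';
        for ($group = 2; $group <= 2 * $k; $group += 2) {
            $lookaheads .= '(?!\g{' . $group . '}(?:,|$))';
        }
        $pattern .= '(,' . $lookaheads . $alternatives . ')?';
    }

    return '/' . $pattern . ')$/i';
}

$pattern = build_unique_list_pattern(
    ['checkin', 'verified_checkin', 'rewards', 'join', 'promotions', 'stream']
);

var_dump(preg_match($pattern, 'rewards,join,promotions,checkin,verified_checkin'));      // int(1)
var_dump(preg_match($pattern, 'rewards,join,promotions,checkin,join,verified_checkin')); // int(0)
```

The number of look-aheads grows quadratically with the number of options, which is why the pattern gets large so quickly.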
