Google RE2 Regex Escaping periods and underscores error


Problem description

I'm trying to validate a username string with the following characteristics:

  • Must not start with . or _
  • Must not end with .
  • Must not contain two . in a row
  • Only lowercase letters, digits, . and _ are allowed

My code is username.matches('^(?!\.)(?!_)(?!.*\.$)(?!.*?\.\.)[a-z0-9_.]+$')

It works in an online regex tester:

https://regex101.com/r/bDXMg3/2/

But the same pattern under Google RE2 syntax (used in Firestore Security Rules) throws a ton of errors.

I then tried double-escaping each . with username.matches('^(?!\\.)(?!_)(?!.*\\.$)(?!.*?\\.\\.)[a-z0-9_.]+$')

That leaves only one error (a red ^ marker at the beginning), but then I get the error below:

Invalid regular expression pattern. Pattern: ^(?!\.)(?!_)(?!.*\.$)(?!.*?\.\.)[a-z0-9_.]+$.

Can anyone let me know what I'm doing wrong?

Solution

RE2 does not support lookaheads (nor lookbehinds), so the (?!...) groups make the pattern invalid even once the dots are escaped correctly.

However, the pattern can be re-written without lookarounds:

^[a-z0-9][a-z0-9_]*([.][a-z0-9_]+)*$

Details

  • ^ - start of string
  • [a-z0-9] - a lowercase letter or digit
  • [a-z0-9_]* - zero or more lowercase letters, digits, or underscores
  • ([.][a-z0-9_]+)* - zero or more sequences of
    • [.] - a dot
    • [a-z0-9_]+ - one or more lowercase letters, digits, or underscores
  • $ - end of string.
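
As a sanity check, the rewritten pattern can be compared against the original lookahead version. The sketch below uses Python's standard re module only because it accepts both patterns. It is a backtracking engine, not RE2, so it says nothing about how Firestore itself parses the expression. The helper names is_valid and patterns_agree, the test alphabet and the length limit are all made up for illustration.

```python
import itertools
import re

# Original pattern from the question; relies on negative lookaheads,
# which Python's re accepts but RE2 does not.
LOOKAHEAD = re.compile(r'^(?!\.)(?!_)(?!.*\.$)(?!.*?\.\.)[a-z0-9_.]+$')

# Lookaround-free rewrite from the answer.
REWRITTEN = re.compile(r'^[a-z0-9][a-z0-9_]*([.][a-z0-9_]+)*$')


def is_valid(username: str) -> bool:
    """Hypothetical helper: validate a username with the rewritten pattern."""
    # fullmatch avoids Python's quirk where $ also matches before a trailing newline.
    return REWRITTEN.fullmatch(username) is not None


def patterns_agree(alphabet: str = "a1_.A", max_len: int = 5) -> bool:
    """Check that both patterns accept and reject the same short strings.

    Enumerates every string over `alphabet` up to `max_len` characters
    (an arbitrary, small search space chosen for this sketch).
    """
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            old = LOOKAHEAD.fullmatch(s) is not None
            new = REWRITTEN.fullmatch(s) is not None
            if old != new:
                return False
    return True
```

Calling patterns_agree() walks a few thousand candidate strings and reports whether the two patterns ever disagree, while is_valid("john.doe_1") and is_valid("jo..hn") exercise one accepted and one rejected username. This is a bounded brute-force check, not a proof of equivalence.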
