Regular expression works on regex101.com, but not on prod
Problem description
https://regex101.com/r/sB9wW6/1
`(?:(?<=\s)|^)@(\S+)` <-- the problem is in the positive lookbehind
It works like this on prod: `(?:\s|^)@(\S+)`, but I need a correct start index (without the space).
Here it is in JS:
var regex = new RegExp(/(?:(?<=\s)|^)@(\S+)/g);
Error parsing regular expression: Invalid regular expression: /(?:(?<=\s)|^)@(\S+)/
What am I doing wrong?
UPDATE
Ok, no lookbehind in JS :(
But anyways, I need a regex to get the proper start and end index of my match. Without leading space.
Make sure you always select the right regex engine at regex101.com. See an issue that occurred due to using a JS-only compatible regex with the `[^]` construct in Python.
JS regex did not support lookbehinds at the time this question was answered. Lookbehind was introduced in ECMAScript 2018 and has been increasingly widely adopted since. You do not really need it here, though, since you can use capturing groups:
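var re = /(?:\s|^)@(\S+)/g;
var str = 's @vln1\n@vln2\n';
var res = [];
var m; // declare the match variable instead of leaking an implicit global
while ((m = re.exec(str)) !== null) {
  res.push(m[1]); // Group 1 holds the text after "@"
}
console.log(res); // ["vln1", "vln2"]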
`(?:\s|^)@(\S+)` matches a whitespace or the start of string with `(?:\s|^)`, then matches `@`, and then matches and captures into Group 1 one or more non-whitespace chars with `(\S+)`.
To get the start/end indices, use
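var re = /(\s|^)@\S+/g;
var str = 's @vln1\n@vln2\n';
var pos = [];
var m; // declare the match variable instead of leaking an implicit global
while ((m = re.exec(str)) !== null) {
  // Skip the captured leading whitespace (if any) to get the real start index
  pos.push([m.index + m[1].length, m.index + m[0].length]);
}
console.log(pos); // [[2, 7], [8, 13]]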
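For engines that do support lookbehind (ECMAScript 2018 and later), the same indices can be read straight from the match, because the lookbehind does not consume the leading whitespace. This is only a sketch under that assumption; the helper name `mentionPositions` is made up for illustration and is not part of the original answer.

```javascript
// Assumes a JavaScript engine with lookbehind support (ECMAScript 2018+)
// and String.prototype.matchAll. mentionPositions is a hypothetical helper name.
function mentionPositions(str) {
  // (?<!\S) asserts that there is no non-whitespace char right before "@",
  // i.e. either whitespace or the start of the string, without consuming it.
  const re = /(?<!\S)@\S+/g;
  const pos = [];
  for (const m of str.matchAll(re)) {
    pos.push([m.index, m.index + m[0].length]);
  }
  return pos;
}

console.log(mentionPositions('s @vln1\n@vln2\n')); // [[2, 7], [8, 13]]
```

The capturing-group version above remains the safer choice when the code has to run on older engines.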
BONUS
My regex works at regex101.com, but not in...
First of all, have you checked the Code Generator link in the Tools pane on the left?
All languages - "Literal string" vs. "String literal" alert
- Make sure you test against the same text used in code, the literal string, at the regex tester. A common scenario is copy/pasting a string literal value directly into the test string field, with all string escape sequences like `\n` (line feed char), `\r` (carriage return), `\t` (tab char). See Regex_search c++, for example. Mind that they must be replaced with their literal counterparts. So, if you have `text = "Text\n\n abc"` in Python, you must use `Text`, two line breaks, ` abc` in the regex tester text field. `Text.*?abc` will never match it although you might think it "works". Yes, `.` does not always match line break chars, see How do I match any character across multiple lines in a regular expression?
All languages - Backslash alert
- Make sure you use backslashes correctly in your string literals. In most languages, regular string literals need a double backslash, i.e. `\d` used at regex101.com must be written as `\\d`. In raw string literals, use a single backslash, same as at regex101. Escaping the word boundary is very important since, in many languages (C#, Python, Java, JavaScript, Ruby, etc.), `"\b"` defines a BACKSPACE char, i.e. it is a valid string escape sequence. PHP does not support the `\b` string escape sequence, so `"/\b/"` = `'/\b/'` there.
All languages - Default flags - Global and Multiline
- Note that the `m` and `g` flags are enabled by default at regex101.com. So, if you use `^` and `$`, they will match at the start and end of lines, respectively. If you need the same behavior in your code, check how multiline mode is implemented and either use a specific flag or, if supported, the inline `(?m)` modifier. The `g` flag enables multiple occurrence matching; it is often implemented using specific functions/methods. Check your language reference to find the appropriate one.
line-breaks
- Line endings at regex101.com are LF only, you can't test strings with CRLF endings, see regex101.com VS myserver - different results. Solutions can be different for each regex library: either use `\R` (PCRE, Java, Ruby) or some kind of `\v` (Boost, PCRE), `\r?\n`, `(?:\r\n?|\n)`/`(?>\r\n?|\n)` (good for .NET) or `[\r\n]+` in other libraries (see answers for C#, PHP).
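To make the CRLF point concrete, here is a small JavaScript sketch (the sample strings are invented for illustration). A pattern that hard-codes `\n` stops matching once the input arrives with CRLF line endings, while `\r?\n` or `\r\n?|\n` copes with both:

```javascript
// Sample inputs are made up: the same two lines with LF and with CRLF endings.
const lf = 'key: 1\nvalue: 2';
const crlf = 'key: 1\r\nvalue: 2';

// Hard-coded LF: fine on LF-only input (as at regex101), fails on CRLF input.
const lfOnly = /1\nvalue/;
console.log(lfOnly.test(lf));   // true
console.log(lfOnly.test(crlf)); // false

// Optional CR before LF: matches both kinds of line ending.
const anyEol = /1\r?\nvalue/;
console.log(anyEol.test(lf));   // true
console.log(anyEol.test(crlf)); // true

// Splitting into lines: /\r\n?|\n/ leaves no stray "\r" at the end of a line.
const naiveLines = crlf.split(/\n/);       // first item still ends with "\r"
const cleanLines = crlf.split(/\r\n?|\n/); // ["key: 1", "value: 2"]
console.log(naiveLines, cleanLines);
```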
Another issue related to the fact that you test your regex against a multiline string (not a list of standalone strings/lines) is that your patterns may consume the end of line char, `\n`, with negated character classes, see an issue like that. There, `\D` matched the end of line char, and in order to avoid it, `[^\d\n]` could be used, or other alternatives.
php
- If you are dealing with Unicode strings, or want shorthand character classes to match Unicode characters, too (e.g. `\w+` to match `Стрибижев` or `Stribiżew`, or `\s+` to match hard spaces), you need to use the `u` modifier, see preg_match() returns 0 although regex testers work
- To match all occurrences, use `preg_match_all`, not `preg_match` with `/...pattern.../g`, see PHP preg_match to find multiple occurrences and "Unknown modifier 'g' in..." when using preg_match in PHP?
- Your regex with an inline backreference like `\1` refuses to work? Are you using a double-quoted string literal? Use a single-quoted one, see Backreference does not work in PHP
php, laravel
- Mind you need the regex delimiters around the pattern, see https://stackoverflow.com/questions/22430529
python
- You used `re.match`, which only searches for a match at the start of the string; use `re.search`: Regex works fine on Pythex, but not in Python
- If the regex contains capturing group(s), `re.findall` returns a list of captures/capture tuples. Either use non-capturing groups, or `re.finditer`, or remove redundant capturing groups, see re.findall behaves weird
- If you used `^` in the pattern to denote the start of a line, not the start of the whole string, or used `$` to denote the end of a line and not of the string, pass the `re.M` or `re.MULTILINE` flag to the `re` method, see Using ^ to match beginning of line in Python regex
- If you try to match some text across multiple lines, and use `re.DOTALL` or `re.S`, or `[\s\S]*`/`[\s\S]*?`, and still nothing works, check if you read the file line by line, say, with `for line in file:`. You must pass the whole file contents as the input to the regex method, see Getting Everything Between Two Characters Across New Lines.
c#, .net
- .NET regex does not support possessive quantifiers like `++`, `*+`, `?+`, `{1,10}+`, see .NET regex matching digits between optional text with possessive quantifer is not working
- When you match against a multiline string and use the `RegexOptions.Multiline` option (or the inline `(?m)` modifier) with a `$` anchor in the pattern to match entire lines, and get no match in code, you need to add `\r?` before `$`, see .Net regex matching $ with the end of the string and not of line, even with multiline enabled
- To get multiple matches, use `Regex.Matches`, not `Regex.Match`, see RegEx Match multiple times in string
- Similar case as above: splitting a string into paragraphs by a double line break sequence - C# / Regex Pattern works in online testing, but not at runtime
- You should remove regex delimiters, i.e. `@"/\d+/"` must actually look like `@"\d+"`, see Simple and tested online regex containing regex delimiters does not work in C# code
- If you unnecessarily used `Regex.Escape` to escape all characters in a regular expression (like `Regex.Escape(@"\d+\.\d+")`), you need to remove `Regex.Escape`, see Regular Expression working in regex tester, but not in c#
dart, flutter
- Use a raw string literal, `RegExp(r"\d")`, or double backslashes (`RegExp("\\d")`) - https://stackoverflow.com/questions/59085824
javascript
- Double escape backslashes in a `RegExp("\\d")`: Why do regex constructors need to be double escaped?
- Lookbehinds (positive and negative) are unsupported in older browsers and JavaScript engines (the feature was only added in ECMAScript 2018): Regex works on browser but not in Node.js
- Strings are immutable, assign the `.replace` result to a var - The .replace() method does change the string in place
- Retrieve all matches with `str.match(/pat/g)` - Regex101 and Js regex search showing different results or, with `RegExp#exec`, RegEx to extract all matches from string using RegExp.exec
- Replace all pattern matches in string: Why does javascript replace only first instance when using replace?
javascript, angular
- Double the backslashes if you define a regex with a string literal, or just use regex literal notation, see https://stackoverflow.com/questions/56097782
java
- Word boundary not working? Make sure you use double backslashes, `"\\b"`, see Regex \b word boundary not works
- Getting an `invalid escape sequence` error? Same thing, double backslashes - Java doesn't work with regex \s, says: invalid escape sequence
- `No match found` is bugging you? Run `Matcher.find()`/`Matcher.matches()` before accessing the groups - Why does my regex work on RegexPlanet and regex101 but not in my code?
- `.matches()` requires a full string match, use `.find()`: Java Regex pattern that matches in any online tester but doesn't in Eclipse
- Access groups using `matcher.group(x)`: Regex not working in Java while working otherwise
- Inside a character class, both `[` and `]` must be escaped - Using square brackets inside character class in Java regex
- You should not run `matcher.matches()` and `matcher.find()` consecutively; use only `if (matcher.matches()) {...}` to check if the pattern matches the whole string and then act accordingly, or use `if (matcher.find())` to check if there is a single match, or `while (matcher.find())` to find multiple matches (or `Matcher#results()`). See Why does my regex work on RegexPlanet and regex101 but not in my code?
kotlin
- You have `Regex("/^\\d+$/")`? Remove the outer slashes, they are regex delimiter chars that are not part of a pattern. See Find one or more word in string using Regex in Kotlin
- You expect a partial string match, but `.matchEntire` requires a full string match? Use `.find`, see Regex doesn't match in Kotlin
mongodb
- Do not enclose `/.../` with single/double quotation marks, see mongodb regex doesn't work
c++
- `regex_match` requires a full string match, use `regex_search` to find a partial match - Regex not working as expected with C++ regex_match
- `regex_search` finds the first match only. Use `sregex_token_iterator` or `sregex_iterator` to get all matches: see What does std::match_results::size return?
- When you read a user-defined string using `std::string input; std::cin >> input;`, note that `cin` will only read up to the first whitespace; to read the whole line properly, use `std::getline(std::cin, input);` - C++ Regex to match '+' quantifier
- `"\d"` does not work, you need to use `"\\d"` or `R"(\d)"` (a raw string literal) - This regex doesn't work in c++
- Make sure the regex is tested against a literal text, not a string literal, see Regex_search c++
go
- Double backslashes or use a raw string literal: Regular expression doesn't work in Go
- Go regex does not support lookarounds, select the right option (`Go`) at regex101.com before testing! Regex expression negated set not working golang
groovy
- Return all matches: Regex that works on regex101 does not work in Groovy
r
- Double escape backslashes in the string literal: "'\w' is an unrecognized escape" in grep
- Use `perl=TRUE` to switch to the PCRE engine (`(g)sub`/`(g)regexpr`): Why is this regex using lookbehinds invalid in R?
oracle
- Greediness of all quantifiers is set by the first quantifier in the regex, see Regex101 vs Oracle Regex (then, you need to make all the quantifiers as greedy as the first one)
firebase
- Double escape backslashes, make sure `^` only appears at the start of the pattern and `$` only at the end (if any), and note you cannot use more than 9 inline backreferences: Firebase Rules Regex Birthday
firebase, google-cloud-firestore
- In Firestore security rules, the regular expression needs to be passed as a string, which also means it shouldn't be wrapped in `/` symbols, i.e. use `allow create: if docId.matches("^\\d+$") ...`. See https://stackoverflow.com/questions/63243300
google-data-studio
- A pattern passed to `REGEXP_REPLACE` must not include the `/` regex delimiters or flags (like `g`), i.e. do not write `/pattern/g` - see How to use Regex to replace square brackets from date field in Google Data Studio?
google-sheets
- If you think `REGEXEXTRACT` does not return full matches or truncates the results, check if you have redundant capturing groups in your regex and remove them, or convert the capturing groups to non-capturing ones by adding `?:` after the opening `(`, see Extract url domain root in Google Sheets
sed
- Why does my regular expression work in X but not in Y?
word-boundary, pcre, php
- `[[:<:]]` and `[[:>:]]` do not work in the regex tester, although they are valid constructs in PCRE, see https://stackoverflow.com/questions/48670105