Capture a dot with postgres regexp
Question
I have these strings:
3 FD160497. 2016 abcd
3 FD160497 2016 abcd
I want to capture "FD", the digits, then the dot if it is present.
I tried:
SELECT
    sqn[1] AS letters,
    sqn[2] AS digits,
    sqn[3] AS dot
FROM (
    SELECT
        regexp_matches(string, '.*?(FD)([0-9]{6})(\.)?.*') AS sqn
    FROM
        mytable
) t;
(PostgreSQL 9.5.3)
The "dot" column is NULL in both cases, and I really don't know why.
It works well on regex101.
Recommended answer
The first quantifier in the branch, the lazy .*?, sets the greediness of the whole branch, so the overall match is the shortest possible one and your pattern behaves like
.*?(FD)([0-9]{6})(\.)??.*?
                     ^^  ^
See its demo at regex101.com
See the 9.7.3.1. Regular Expression Details excerpt:
...matching is done in such a way that the branch, or whole RE, matches the longest or shortest possible substring as a whole. Once the length of the entire match is determined, the part of it that matches any particular subexpression is determined on the basis of the greediness attribute of that subexpression, with subexpressions starting earlier in the RE taking priority over ones starting later.
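The PostgreSQL manual illustrates this rule with a pair of queries, reproduced here as a small sketch. The results in the comments are the ones the manual reports.

```sql
-- Greedy Y*: the branch as a whole is greedy, so the overall match is the
-- longest one starting at Y ("Y123"); the captured group is 123.
SELECT substring('XY1234Z', 'Y*([0-9]{1,3})');   -- 123

-- Lazy Y*?: the branch as a whole is lazy, so the overall match is the
-- shortest one starting at Y ("Y1"); the captured group is 1.
SELECT substring('XY1234Z', 'Y*?([0-9]{1,3})');  -- 1
```

The same thing happens in the question: once .*? makes the branch lazy, the shortest overall match ends right after the six digits, so the optional dot group never takes part.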
You need to use quantifiers consistently within one branch:
regexp_matches(string, '.*(FD)([0-9]{6})(\.)?.*') as sqn
or
regexp_matches(string, '.*[[:blank:]](FD)([0-9]{6})(\.)?.*') as sqn
See the regex demo.
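To compare the two behaviours side by side without a table, something like the following could be run. It is a sketch only: the inline VALUES list stands in for mytable, and the names s, lazy, greedy, and m are hypothetical aliases chosen for illustration.

```sql
-- Compare the third capture group (the dot) for the lazy and greedy patterns.
SELECT s,
       lazy.m[3]   AS dot_lazy,    -- pattern from the question
       greedy.m[3] AS dot_greedy   -- first pattern suggested in the answer
FROM (VALUES ('3 FD160497. 2016 abcd'),
             ('3 FD160497 2016 abcd')) AS t(s),
     regexp_matches(s, '.*?(FD)([0-9]{6})(\.)?.*') AS lazy(m),
     regexp_matches(s, '.*(FD)([0-9]{6})(\.)?.*')  AS greedy(m);
```

With the sample strings, dot_lazy should be NULL in both rows, while dot_greedy should be '.' for the first string and NULL for the second.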