Capture a dot with postgres regexp

Problem description

I have these strings:

3           FD160497.   2016  abcd
3           FD160497   2016  abcd

I want to capture "FD", the digits, and then the dot if it is present.

I tried this:

SELECT
    sqn[1] AS letters,
    sqn[2] AS digits,
    sqn[3] AS dot
FROM (
    SELECT
        regexp_matches(string, '.*?(FD)([0-9]{6})(\.)?.*') as sqn
    FROM
        mytable
) t;

(PostgreSQL 9.5.3)

The "dot" column is NULL in both cases, and I really don't know why. The same pattern works well on regex101.

Recommended answer

The leading lazy quantifier (.*?) makes all quantifiers in the current branch lazy, so your pattern becomes equivalent to

.*?(FD)([0-9]{6})(\.)??.*?
                     ^^  ^

See its demo at regex101.com.

See this excerpt from 9.7.3.1. Regular Expression Details in the PostgreSQL documentation:

...matching is done in such a way that the branch, or whole RE, matches the longest or shortest possible substring as a whole. Once the length of the entire match is determined, the part of it that matches any particular subexpression is determined on the basis of the greediness attribute of that subexpression, with subexpressions starting earlier in the RE taking priority over ones starting later.
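To see this rule in isolation, the two queries below reproduce the example pair from the PostgreSQL documentation's discussion of regular expression matching rules. They differ only in whether the first quantifier is greedy. This is an illustrative sketch added for clarity, not part of the original answer.

```sql
-- Greedy first quantifier: the RE as a whole is greedy,
-- so the group takes as many digits as {1,3} allows.
SELECT substring('XY1234Z', 'Y*([0-9]{1,3})');   -- documented result: 123

-- Lazy first quantifier: the RE as a whole is non-greedy,
-- so the overall match is as short as possible and the group gets one digit.
SELECT substring('XY1234Z', 'Y*?([0-9]{1,3})');  -- documented result: 1
```

The optional dot in the question is lost for the same reason: once the overall match is made as short as possible, it ends right after the six digits.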

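You need to use quantifiers consistently within one branch. Either of the following patterns starts with a greedy quantifier; the second one additionally requires a blank (space or tab) right before FD: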

regexp_matches(string, '.*(FD)([0-9]{6})(\.)?.*') as sqn

regexp_matches(string, '.*[[:blank:]](FD)([0-9]{6})(\.)?.*') as sqn
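As a self-contained check, the query from the question can be rerun with the first corrected pattern. This is a minimal sketch: the inline VALUES list is an assumption that stands in for mytable and holds only the two sample strings from the question, and it assumes the default standard_conforming_strings setting so that the backslash reaches the regex engine unchanged.

```sql
-- Inline sample data standing in for mytable (hypothetical stand-in, two rows from the question).
WITH mytable(string) AS (
    VALUES
        ('3           FD160497.   2016  abcd'),
        ('3           FD160497   2016  abcd')
)
SELECT
    sqn[1] AS letters,
    sqn[2] AS digits,
    sqn[3] AS dot   -- expected to be '.' when the dot is present, NULL otherwise
FROM (
    -- The leading .* is greedy, so the whole RE is greedy and (\.)? can take the dot.
    SELECT regexp_matches(string, '.*(FD)([0-9]{6})(\.)?.*') AS sqn
    FROM mytable
) t;
```

Swapping in the original pattern with the leading .*? in the same query should reproduce the NULL in both rows, which makes the two behaviours easy to compare side by side.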

See the regex demo.
