Regular expression match between occurrences of a character

Problem description

I have the following string:

3#White House, District Of Columbia, United States#US#USDC#DC001#38.8951#-77.0364#531871#382

As you can see, the string is delimited by # characters. My use case resembles a simple SPLIT(string,"#") operation, but regex gives me a bit more flexibility.

I would like to match the characters between two occurrences of #. For example, the characters between the second and third occurrence should match: 'US'

I'm using Google BigQuery and was able to match the first two terms of the string, but I'm struggling with the third:

REGEXP_EXTRACT(locations,r'^\d') as location_type,
REGEXP_REPLACE(REGEXP_EXTRACT(locations,r'^\d#.*?#'),r'^\d*#|#','') as location_full_name,
????

locations holds strings such as the one above.

I've found this question, but I have multiple delimiters and would like to specify between which occurrences the match should take place, e.g. between the 2nd and 5th occurrence.

Recommended answer

You may use a regex like ^(?:[^#]*#){N}([^#]*) where N is the position of the substring you need minus 1. To get US, which is the third value, N is 2, so you may use

^(?:[^#]*#){2}([^#]*)

See the regex demo.
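As a quick local check of the pattern above, here is a minimal Python sketch. It uses Python's `re` module rather than BigQuery, and the variable names are illustrative only; in BigQuery the same pattern would be passed to REGEXP_EXTRACT as in the question.

```python
import re

# Sample value taken from the question.
location = (
    "3#White House, District Of Columbia, United States"
    "#US#USDC#DC001#38.8951#-77.0364#531871#382"
)

# Skip two "field followed by #" sequences, then capture the third field.
pattern = re.compile(r"^(?:[^#]*#){2}([^#]*)")

match = pattern.search(location)
third_field = match.group(1) if match else None
print(third_field)  # US
```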

Details

  • ^ - start of string
  • (?:[^#]*#){2} - two sequences of:
    • [^#]* - any zero or more chars other than #
    • # - a # char
  • ([^#]*) - capturing group 1: any zero or more chars other than #; this is the substring REGEXP_EXTRACT returns
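The breakdown above generalizes to any position, and the same idea can be stretched to cover a span between two occurrences of #, such as the 2nd and 5th mentioned in the question. The following Python sketch is one possible way to do that; `field_at` and `between_occurrences` are hypothetical helper names introduced here for illustration, not part of the original answer or of BigQuery.

```python
import re

SAMPLE = (
    "3#White House, District Of Columbia, United States"
    "#US#USDC#DC001#38.8951#-77.0364#531871#382"
)

def field_at(text, position):
    """Return the field at 1-based `position`, or None if the text has fewer fields."""
    # N = position - 1 "field followed by #" sequences are skipped before the capture.
    pattern = r"^(?:[^#]*#){%d}([^#]*)" % (position - 1)
    match = re.search(pattern, text)
    return match.group(1) if match else None

def between_occurrences(text, start, end):
    """Return everything between the `start`-th and `end`-th '#' (1-based), or None."""
    if end <= start:
        raise ValueError("end must be greater than start")
    # Skip `start` delimited fields, capture (end - start) fields joined by '#',
    # and require the `end`-th '#' immediately after the capture.
    pattern = r"^(?:[^#]*#){%d}([^#]*(?:#[^#]*){%d})#" % (start, end - start - 1)
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(field_at(SAMPLE, 3))                # US
print(between_occurrences(SAMPLE, 2, 5))  # US#USDC#DC001
```

Because [^#]* can never cross a delimiter, each repetition consumes exactly one field, so the counts in the quantifiers map directly onto occurrence numbers.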
