Python regex findall returns empty string when not asked


Question

I'm trying to extract salaries from a list of strings. I'm using re.findall(), but it returns many empty strings along with the salaries, and this causes problems later in my code.


import re

sal = '41 000€ à 63 000€ / an'  # a sample string that triggers the problem

regex = r' ?([0-9]* ?[0-9]?[0-9]?[0-9]?)'  # my regex

re.findall(regex, sal)[0]
# returns '41 000' as expected, but:
re.findall(regex, sal)[1]
# returns: ''
# Desired result: '63 000'

#the whole list of matches is like this:
['41 000',
 '',
 '',
 '',
 '',
 '',
 '',
 '63 000',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '',
 '']
# I would prefer ['41 000','63 000']

Can anyone help? Thanks

Recommended answer

When a pattern contains capturing groups, re.findall returns the contents of those groups rather than the whole match. In your pattern every element, inside and outside the group, is optional, so the pattern can also match the empty string, and each of those empty matches contributes an empty string to the result.

In your pattern, [0-9]* matches a digit zero or more times. If there is no limit on the number of leading digits, you can use [0-9]+ instead, so that at least one digit is required.
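As a quick check of the point above, here is a minimal sketch comparing the asker's pattern with a variant where only the first `*` is changed to `+`. The names `optional_pattern` and `required_pattern` are made up for this illustration, and the variant is not the final pattern suggested in this answer.

```python
import re

sal = '41 000€ à 63 000€ / an'

# The asker's pattern: every element is optional.
optional_pattern = r' ?([0-9]* ?[0-9]?[0-9]?[0-9]?)'
# Illustrative variant: at least one leading digit is required.
required_pattern = r' ?([0-9]+ ?[0-9]?[0-9]?[0-9]?)'

# An all-optional pattern matches the empty string, so findall can
# report an empty match at positions where there is no digit.
matches_empty = re.fullmatch(optional_pattern, '') is not None
# With [0-9]+ the empty string no longer matches.
requires_digit = re.fullmatch(required_pattern, '') is None

with_plus = re.findall(required_pattern, sal)
print(matches_empty, requires_digit)  # True True
print(with_plus)                      # ['41 000', '63 000']
```

The variant removes the empty strings for this sample, but it still accepts any run of digits, whether or not a euro sign follows.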

You might use this pattern with a capturing group:

(?<!\S)([0-9]+(?: [0-9]{1,3})?)€(?!\S)


Explanation

  • (?<!\S) Assert that what is on the left is not a non-whitespace character
  • ( Start of the capture group
    • [0-9]+(?: [0-9]{1,3})? Match 1+ digits followed by an optional part that matches a space and 1-3 digits
  • ) Close the capture group
  • € Match the euro sign literally
  • (?!\S) Assert that what is on the right is not a non-whitespace character
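To see what the two lookaround assertions do, here is a small sketch. The input strings are hypothetical examples chosen for this illustration; they do not come from the question.

```python
import re

pattern = r'(?<!\S)([0-9]+(?: [0-9]{1,3})?)€(?!\S)'

# The amount stands alone between whitespace: it is captured.
standalone = re.findall(pattern, 'salaire: 500€ / an')
# A non-whitespace character directly before the digits: (?<!\S) rejects it.
glued_left = re.findall(pattern, 'ref500€ / an')
# A non-whitespace character directly after the euro sign: (?!\S) rejects it.
glued_right = re.findall(pattern, '500€/an')

print(standalone)   # ['500']
print(glued_left)   # []
print(glued_right)  # []
```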

Your code might look like this:

import re

sal = '41 000€ à 63 000€ / an'  # sample string from the question
regex = r'(?<!\S)([0-9]+(?: [0-9]{1,3})?)€(?!\S)'
print(re.findall(regex, sal))  # ['41 000', '63 000']
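Since the question mentions a list of strings, the same pattern could be applied to each entry and the matches converted to integers. This is only a sketch: the helper name `extract_salaries` and the sample list `offers` are assumptions for illustration, not part of the original question or answer.

```python
import re

# Same pattern as above, compiled once for reuse.
SALARY_RE = re.compile(r'(?<!\S)([0-9]+(?: [0-9]{1,3})?)€(?!\S)')


def extract_salaries(text):
    """Return every euro amount found in text as an int (hypothetical helper)."""
    # Drop the thousands separator (a space) before converting.
    return [int(amount.replace(' ', '')) for amount in SALARY_RE.findall(text)]


# Hypothetical sample data; only the first entry comes from the question.
offers = ['41 000€ à 63 000€ / an', '2 500€ / mois', 'salaire non précisé']
salaries = [extract_salaries(offer) for offer in offers]
print(salaries)  # [[41000, 63000], [2500], []]
```

Strings without a matching amount yield an empty list, so later code can test for that instead of receiving empty strings.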
      
