Filling a list with regular expression results in Python


Question

I would like to search a string for specific substrings and store them in a list using re. How would I go about doing that?

This is my current code:

#!/usr/bin/env python
from sys import stdin
import re as reg

regex = reg.compile(r"\".*\"")#match "  match me  "
line = stdin.readline().strip().split()
myList = [ match for match in regex.finditer(line) ]
print myList

This is the sample input:

"RUn.exe O" "" "   2ne, " two! . " "

This is the expected output. There is no need to add the angle brackets; I included them only to make the specific matches clear:

<RUn.exe O>
<>
<   2ne, >
<two!>
<.>
< >

So basically, anything between quotes should be written to the list as a single item (without the quotes), and anything outside quotes should be written to the list as it is, one whitespace-separated token per item.

Thanks, everyone.

P.S. Where can I learn about using regex in Python? I am comfortable using regex with grep or awk, but I am still new to Python. I especially like the idea of compiling a regex so it can be used over and over, but I don't know where to learn more about that.

Recommended answer

I think this is what you're looking for, though you might be able to do without the for-loop...

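import re

line = '"RUn.exe O" "" "   2ne, " two! . " "'
# Match either a double-quoted run (possibly empty) or a run of non-whitespace characters.
regex = re.compile(r'"[^"]*"|[^\s]+')
# Strip the surrounding quotes from each match.
matches = [el.strip('"') for el in regex.findall(line)]

print('\n'.join(matches))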

You can use both ' and " to delimit a string in Python. Because I create the string using ', it doesn't end when a " is encountered, and I don't have to escape it. If you're writing a string that contains either ' or ", it's convenient to use the other one as the delimiter.
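To make the quoting point concrete, here is a small sketch. The variable names are only illustrative and do not come from the answer. It shows that the single-quoted form and the escaped double-quoted form produce the same string, and that a raw string (the r prefix used for the pattern above) keeps backslashes literal.

```python
# The same text written with two different delimiters.
single_quoted = '"RUn.exe O" ""'          # no escaping needed inside '...'
double_quoted = "\"RUn.exe O\" \"\""      # every " must be escaped inside "..."

print(single_quoted == double_quoted)     # True

# A raw string leaves the backslash in \s untouched, so the regex
# engine (not the Python string parser) gets to interpret it.
pattern_text = r'"[^"]*"|[^\s]+'
print(pattern_text)
```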

The regex works as follows: first match a ". Then [^"] means any character that is not (^) a ", and * matches any number of them. Then match the closing ". Similarly, [^\s] means any character that is not whitespace, and + means one or more. The | between the two halves means "either", so at each position a quoted string is tried first and a bare token otherwise.
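The explanation above can be checked step by step on the sample input. This is only a sketch; the names token_re, raw and tokens are made up for illustration. It prints the raw matches (quotes still attached) and then the cleaned list.

```python
import re

line = '"RUn.exe O" "" "   2ne, " two! . " "'

# Left alternative: a quoted run. Right alternative: a run of non-whitespace.
token_re = re.compile(r'"[^"]*"|[^\s]+')

# findall returns the full text of every non-overlapping match, left to right.
raw = token_re.findall(line)
print(raw)
# ['"RUn.exe O"', '""', '"   2ne, "', 'two!', '.', '" "']

# Remove the delimiting quotes; whitespace inside the quotes is preserved.
tokens = [t.strip('"') for t in raw]
print(tokens)
# ['RUn.exe O', '', '   2ne, ', 'two!', '.', ' ']
```

Note that strip('"') would also remove a quote at the edge of an unquoted token, which does not matter for this input but may for others.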

As for learning more, re's documentation is a good place to start: http://docs.python.org/2.7/library/re.html#match-objects
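Since the question asks about compiling a regex for repeated use, here is one possible sketch that ties the compiled pattern to the match objects described at that link. It is a variation on the answer, not the answer's own method: the pattern uses capturing groups instead of strip, and TOKEN_RE and tokenize are hypothetical names chosen for this example.

```python
import re

# Compiled once at import time and reused on every call.
# Group 1 captures the inside of a quoted string, group 2 a bare token.
TOKEN_RE = re.compile(r'"([^"]*)"|(\S+)')


def tokenize(line):
    """Return the quoted and unquoted tokens of line, without the quotes."""
    tokens = []
    for match in TOKEN_RE.finditer(line):
        quoted, bare = match.group(1), match.group(2)
        # A group that did not take part in the match is None, while an
        # empty quoted string "" yields '', so test against None explicitly.
        tokens.append(quoted if quoted is not None else bare)
    return tokens


for text in ['"RUn.exe O" "" "   2ne, " two! . " "', 'a "b c"']:
    print(tokenize(text))
```

Reading the groups from the match object avoids touching quote characters that are part of an unquoted token.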
