Match multiline regex in file object

Views: 99 | Published: 2020/5/13 18:37:50 | Tags: python regex multiline


Problem description

How can I extract the groups matched by this regex from a file object (data.txt)?

import numpy as np
import re
import os
ifile = open("data.txt",'r')

# Regex pattern
pattern = re.compile(r"""
                ^Time:(\d{2}:\d{2}:\d{2})   # Time: 12:34:56 at beginning of line
                \r{2}                       # Two carriage return
                \D+                         # 1 or more non-digits
                storeU=(\d+\.\d+)
                \s
                uIx=(\d+)
                \s
                storeI=(-?\d+.\d+)
                \s
                iIx=(\d+)
                \s
                avgCI=(-?\d+.\d+)
                """, re.VERBOSE | re.MULTILINE)

time = []

for line in ifile:
    match = re.search(pattern, line)
    if match:
        time.append(match.group(1))

The problem in the last part of the code is that I iterate line by line, which obviously doesn't work with a regex that spans several lines. I have tried to use pattern.finditer(ifile) like this:

for match in pattern.finditer(ifile):
    print(match)

... just to see if it works, but the finditer method requires a string or buffer, not a file object.

I have also tried this method, but can't get it to work:

matches = [m.groups() for m in pattern.finditer(ifile)]

Any ideas?

After comments from Mike and Tuomas, I was told to use .read(), something like this:

ifile = open("data.txt",'r').read()

Reading the file this way works fine, but is the following the correct way to search through it? I can't get it to work:

for i in pattern.finditer(ifile):
    match = re.search(pattern, i)
    if match:
        time.append(match.group(1))

Solution

import re

# Read the whole file into one string so the pattern can span lines;
# the with-block closes the file object automatically
with open("data.txt", "r") as ifile:
    text = ifile.read()

# Regex pattern
pattern_meas = re.compile(r"""
                ^Time:(\d{2}:\d{2}:\d{2})   # Time:12:34:56 at beginning of line
                \n{2}                       # Two newlines
                \D+                         # 1 or more non-digits
                storeU=(\d+\.\d+)           # Decimal number
                \s
                uIx=(\d+)                   # Fetch uIx variable
                \s
                storeI=(-?\d+\.\d+)         # Fetch storeI variable (dot escaped)
                \s
                iIx=(\d+)                   # Fetch iIx variable
                \s
                avgCI=(-?\d+\.\d+)          # Fetch avgCI variable (dot escaped)
                """, re.VERBOSE | re.MULTILINE)

# Write one line per match, with the six captured groups in order
with open("output_times.txt", "w") as file_times:
    for match in pattern_meas.finditer(text):
        output = "%s,\t%s,\t\t%s,\t%s,\t\t%s,\t%s\n" % match.groups()
        file_times.write(output)
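
Two things make this version work where the line-by-line loop did not: the pattern is applied to the whole file contents as one string, and the blank line after the time stamp is matched with `\n{2}` rather than `\r{2}`. In Python 3, a file opened in text mode with the default `newline` setting returns `\r\n` and `\r` line endings as `\n`, so carriage returns never reach the regex. The sketch below illustrates both points on made-up sample data; the record layout, the `Measurement:` label, and the shortened pattern are assumptions for illustration, not the contents of the real data.txt.

```python
import os
import re
import tempfile

# Hypothetical sample record layout (assumed, not taken from the real data.txt).
sample = (
    "Time:12:34:56\n"
    "\n"
    "Measurement: storeU=1.25 uIx=3\n"
    "Time:12:35:56\n"
    "\n"
    "Measurement: storeU=2.50 uIx=4\n"
)

# Shortened version of the pattern: time stamp plus the first value only.
pattern = re.compile(r"""
    ^Time:(\d{2}:\d{2}:\d{2})   # time stamp at the start of a line
    \n{2}                       # end of that line plus one blank line
    \D+                         # label text before the first value
    storeU=(\d+\.\d+)           # decimal number
    """, re.VERBOSE | re.MULTILINE)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")

    # Write the sample with Windows line endings, byte for byte.
    with open(path, "wb") as f:
        f.write(sample.replace("\n", "\r\n").encode("ascii"))

    # Line by line: no single line holds a whole record, so nothing matches.
    with open(path, "r") as f:
        per_line = [m.groups() for line in f for m in pattern.finditer(line)]

    # Whole file: text mode has already turned "\r\n" into "\n".
    with open(path, "r") as f:
        text = f.read()

whole_text = [m.groups() for m in pattern.finditer(text)]

print(per_line)
print(whole_text)
```

Reading the entire file with `.read()` is fine for logs that fit comfortably in memory, which is the situation this thread assumes.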

Maybe it can be written more compactly and in a more Pythonic way, though.
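
One possible compact form, offered as a sketch rather than the definitive version: compile the pattern once, collect `match.groups()` in a list comprehension, and join the fields when writing. The helper names `extract_measurements` and `format_rows`, the sample record, and the uniform `",\t"` separator are assumptions introduced here for illustration; the code above uses a mix of one and two tabs.

```python
import re

PATTERN_MEAS = re.compile(r"""
    ^Time:(\d{2}:\d{2}:\d{2})   # time stamp at the start of a line
    \n{2}                       # two newlines
    \D+                         # one or more non-digits
    storeU=(\d+\.\d+) \s
    uIx=(\d+) \s
    storeI=(-?\d+\.\d+) \s
    iIx=(\d+) \s
    avgCI=(-?\d+\.\d+)
    """, re.VERBOSE | re.MULTILINE)


def extract_measurements(text):
    """Return one tuple of six captured strings per record found in text."""
    return [m.groups() for m in PATTERN_MEAS.finditer(text)]


def format_rows(rows):
    """Join each record with a comma and a tab, one record per line."""
    return "".join(",\t".join(row) + "\n" for row in rows)


# Hypothetical record in the shape the pattern expects (not the real data.txt).
sample = (
    "Time:12:34:56\n"
    "\n"
    "Measurement: storeU=1.25 uIx=3 storeI=-0.50 iIx=7 avgCI=0.75\n"
)

rows = extract_measurements(sample)
formatted = format_rows(rows)
print(formatted, end="")
```

With a real file, the same two helpers would be called on the string returned by `.read()`, and `formatted` would be passed to a single `write()` call on the output file.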
