Python: join specific lines onto one line


Problem description

Suppose I have this file:

1
17:02,111
Problem report related to
router

2
17:05,223
Restarting the systems

3
18:02,444
Must erase hard disk
now due to compromised data

I want this output:

1
17:02,111
Problem report related to router

2
17:05,223
Restarting the systems

3
18:02,444
Must erase hard disk now due to compromised data

I have been trying in bash and got fairly close to a solution, but I don't know how to do this in Python.

Thanks in advance.

Recommended answer

If you want to remove the extra lines:

Check two conditions for each line and keep the line when either one holds: the line is not followed by an empty line, or the line is preceded by a line that matches the regex ^\d{2}:\d{2},\d{3}\s$ (a timestamp line).

To look at the next line on each iteration, create a second iterator named temp from the main file object with itertools.tee and call next on it. Use re.match to test the previous line against the regex.

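from itertools import tee
import re

# Matches a timestamp line such as "17:02,111\n"
time_re = re.compile(r'^\d{2}:\d{2},\d{3}\s$')

with open('ex.txt') as f, open('new.txt', 'w') as out:
    temp, f = tee(f)      # temp will run one line ahead of f
    next(temp, None)
    pre = ''              # previous line; empty before the first line
    for line in f:
        # At end of file, treat the missing next line as a blank line
        nxt = next(temp, '\n')
        if nxt != '\n' or time_re.match(pre):
            out.write(line)
        pre = line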

Result:

1
17:02,111
Problem report related to

2
17:05,223
Restarting the systems

3
18:02,444
Must erase hard disk

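The look-ahead step can also be seen in isolation. The following is a minimal sketch, not part of the original answer: with_next is a hypothetical helper name, and the '\n' sentinel for the last line is an assumption chosen so that the end of input behaves like a blank line.

```python
from itertools import tee

def with_next(lines, sentinel='\n'):
    """Yield (line, next_line) pairs; the last line is paired with sentinel."""
    current, ahead = tee(lines)
    next(ahead, None)  # advance the copy so it runs one item ahead
    for line in current:
        yield line, next(ahead, sentinel)

pairs = list(with_next(['a\n', 'b\n', '\n', 'c\n']))
for line, nxt in pairs:
    print(repr(line), repr(nxt))
```

Because tee buffers only the items that one iterator has consumed and the other has not, the buffer here never holds more than one line.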

If you want to concatenate the rest onto the third line:

If you want to join the lines that follow the third line onto the third line, you can use the following regex to find all blocks that are followed by \n\n or the end of the file ($):

r"(.*?)(?=\n\n|$)"

Then split each block on the line that is in timestamp format and write the parts to the output file. Note that the newlines within the third part must be replaced with spaces.
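The split step on a single block can be sketched as follows. The sample block string is made up for illustration; the capturing parentheses in the pattern are what make re.split keep the timestamp line in its result.

```python
import re

# One record, without the surrounding blank lines
block = "1\n17:02,111\nProblem report related to\nrouter"

# Splitting on a captured group returns [before, separator, after]
first, second, third = re.split(r'(\d{2}:\d{2},\d{3}\n)', block, maxsplit=1)

# Only the description part has its newlines replaced
joined = first + second + third.replace('\n', ' ')
print(joined)
```

If a block contains no timestamp line, re.split returns a single element and the three-way unpacking raises ValueError, so malformed records are reported rather than silently passed through.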

ex.txt:

1
17:02,111
Problem report related to
router
another line


2
17:05,223
Restarting the systems

3
18:02,444
Must erase hard disk
now due to compromised data
line 5
line 6
line 7

Demo:

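import re

def splitter(s):
    # Yield each block that is followed by a blank line or the end of the text
    for x in re.finditer(r"(.*?)(?=\n\n|$)", s, re.DOTALL):
        # strip() drops the separator newlines that a match may begin with
        g = x.group(0).strip()
        if g:
            yield g

with open('ex.txt') as f, open('new.txt', 'w') as out:
    for block in splitter(f.read()):
        # The capturing group keeps the timestamp line in the split result
        first, second, third = re.split(r'(\d{2}:\d{2},\d{3}\n)', block, maxsplit=1)
        # Write '\n\n' instead of '\n' to keep a blank line between records
        out.write(first + second + third.replace('\n', ' ') + '\n')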

Result:

1
17:02,111
Problem report related to router another line
2
17:05,223
Restarting the systems
3
18:02,444
Must erase hard disk now due to compromised data line 5 line 6 line 7
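The result above has no blank lines between records, whereas the question asked to keep them. The following is a minimal sketch that keeps them, assuming every record consists of an index line, a timestamp line, and then one or more description lines. The name join_descriptions is hypothetical and not part of the original answer.

```python
import re

def join_descriptions(text):
    """Join all lines after the timestamp line into one line, per record."""
    records = []
    # Records are separated by one or more blank lines
    for block in re.split(r'\n\s*\n', text.strip()):
        lines = [ln.strip() for ln in block.splitlines()]
        merged = lines[:2]                  # index line and timestamp line
        if lines[2:]:
            merged.append(' '.join(lines[2:]))
        records.append('\n'.join(merged))
    return '\n\n'.join(records) + '\n'

sample = (
    "1\n17:02,111\nProblem report related to\nrouter\n\n"
    "2\n17:05,223\nRestarting the systems\n\n"
    "3\n18:02,444\nMust erase hard disk\nnow due to compromised data\n"
)
print(join_descriptions(sample), end='')
```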

Note:

In this answer splitter is a generator function, so blocks are yielded one at a time instead of being collected in a list first. Keep in mind that f.read() still loads the entire file into memory, so for very large files the input should be read line by line instead.
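For files too large to read at once, the blocks can be built lazily from the line iterator. The following is a minimal sketch that swaps the regex for itertools.groupby, which the original answer does not use. The names iter_blocks and merge are hypothetical, and the record layout (index line, timestamp line, description lines) is assumed from the question.

```python
from itertools import groupby

def iter_blocks(lines):
    """Yield each run of consecutive non-blank lines as a list of stripped strings."""
    for is_blank, group in groupby(lines, key=lambda ln: not ln.strip()):
        if not is_blank:
            yield [ln.strip() for ln in group]

def merge(block):
    """Keep the first two lines and join the remaining lines with spaces."""
    head, body = block[:2], block[2:]
    return '\n'.join(head + ([' '.join(body)] if body else []))

lines = ['1\n', '17:02,111\n', 'Problem report related to\n', 'router\n',
         '\n',
         '2\n', '17:05,223\n', 'Restarting the systems\n']
merged = [merge(b) for b in iter_blocks(lines)]
print('\n\n'.join(merged))
```

A file object can be passed in place of the list, for example iter_blocks(f) inside a with open('ex.txt') as f block, so that only one record is held in memory at a time.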
