python - Read file from and to specific lines of text

Problem description

I'm not talking about specific line numbers, because I'm reading multiple files that share the same format but vary in length.
Say I have this text file:

Something here...  
... ... ...   
Start                      #I want this block of text 
a b c d e f g  
h i j k l m n  
End                        #until this line of the file
something here...  
... ... ...  

I hope you know what I mean. I was thinking of iterating through the file, searching with a regular expression to find the line numbers of "Start" and "End", and then using linecache to read from the Start line to the End line. But how do I get the line numbers? What function can I use?

Solution

If you simply want the block of text between Start and End, you can do something simple like:

with open('test.txt') as input_data:
    # Skips text before the beginning of the interesting block:
    for line in input_data:
        if line.strip() == 'Start':  # Or whatever test is needed
            break
    # Reads text until the end of the block:
    for line in input_data:  # This keeps reading the file
        if line.strip() == 'End':
            break
        print(line, end='')  # Line is extracted (or block_of_lines.append(line), etc.)

In fact, you do not need to manipulate line numbers in order to read the data between the Start and End markers.
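If the line numbers are still wanted (for instance to hand them to linecache, as the question suggests), the same two-loop structure can report them by wrapping the lines in enumerate. This is only a sketch: the function name find_block_span and its start/end parameters are made up for illustration, and it assumes each marker sits alone on its line.

```python
def find_block_span(lines, start='Start', end='End'):
    """Return (start_line, end_line) as 1-based numbers; None for a missing marker."""
    numbered = enumerate(lines, start=1)  # one iterator shared by both loops
    start_no = end_no = None
    # Skip ahead to the start marker, remembering its line number:
    for number, line in numbered:
        if line.strip() == start:
            start_no = number
            break
    # Keep reading from where the first loop stopped, up to the end marker:
    for number, line in numbered:
        if line.strip() == end:
            end_no = number
            break
    return start_no, end_no


# Example with an in-memory list; an open file object can be passed the same way.
sample = ['Something here...\n', 'Start\n', 'a b c d e f g\n',
          'h i j k l m n\n', 'End\n', 'something here...\n']
print(find_block_span(sample))  # (2, 5)
```

The block itself then lies on the lines strictly between the two numbers returned.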

The logic ("read until…") is repeated in both blocks, but it is quite clear and efficient (other methods typically involve checking some state [before block/within block/end of block reached], which incurs a time penalty).
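For comparison, here is a rough sketch of the state-based alternative mentioned above, where a flag is checked on every line. It is not the approach this answer recommends; the generator name iter_blocks and its parameters are hypothetical, and markers are again assumed to stand alone on their lines.

```python
def iter_blocks(lines, start='Start', end='End'):
    """Yield the lines found between each start/end marker pair."""
    inside = False  # state flag: are we currently within a block?
    for line in lines:
        marker = line.strip()
        if not inside:
            if marker == start:
                inside = True   # block begins after this line
        elif marker == end:
            inside = False      # block is over; keep scanning for another one
        else:
            yield line.rstrip('\n')


sample = ['intro\n', 'Start\n', 'a b c\n', 'End\n',
          'middle\n', 'Start\n', 'h i j\n', 'End\n']
print(list(iter_blocks(sample)))  # ['a b c', 'h i j']
```

The flag is tested for every line, which is the extra cost referred to above. What it buys is that a file containing several Start/End pairs is handled in one pass, whereas the two-loop version stops after the first block.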
