Pandas: ignore all lines following a specific string when reading a file into a DataFrame

Views: 147 | Published: 2020/5/23 23:20:41 | Tags: python, pandas

Problem description

I have a file that I want to read into a pandas DataFrame. Its contents can be summarized as follows:

[Header]
Some_info = some_info
[Data]
Col1    Col2
0.532   Point
0.234   Point
0.123   Point
1.455   Square
14.64   Square
[Other data]
Other1  Other2
Test1   PASS
Test2   FAIL

My goal is to read only the portion of text between [Data] and [Other data], whose length varies from file to file. The header always has the same length, so the skiprows parameter of pandas.read_csv can handle it. However, skipfooter needs the number of lines to skip, and that number can change between files.

What would be the best solution here? I would like to avoid altering the file externally unless there is no other option.

Recommended answer

This method has to pass over the file twice: once to count the footer lines, and once to read the data.

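import itertools as it

import pandas as pd


def get_footer(file_):
    """Count the lines from the '[Other data]' marker to the end of the file."""
    with open(file_) as f:
        # Skip lines until the marker; whatever remains is the footer.
        # strip() tolerates '\r\n' endings and a missing final newline.
        g = it.dropwhile(lambda x: x.strip() != '[Other data]', f)
        footer_len = sum(1 for _ in g)
    return footer_len


footer_len = get_footer('file.txt')
# Note: skipfooter is only supported by the Python parser engine (engine='python').
df = pd.read_csv('file.txt', … skipfooter=footer_len)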
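
If reading the file twice is undesirable, one possible alternative is to cut out the wanted section in a single pass and hand only those lines to pandas.read_csv through an in-memory buffer. The sketch below is not part of the original answer. The helper name read_section is hypothetical, and it assumes the markers sit on their own lines and that columns are separated by runs of whitespace, as in the sample above.

```python
import io
import itertools as it

import pandas as pd


def read_section(lines, start="[Data]", end="[Other data]"):
    """Return the rows between the `start` and `end` marker lines as a DataFrame.

    `lines` is any iterable of text lines (an open file object works).
    Hypothetical helper for illustration only.
    """
    stripped = (line.strip() for line in lines)
    # Drop everything before the start marker ...
    after_start = it.dropwhile(lambda s: s != start, stripped)
    next(after_start, None)  # ... and discard the marker line itself.
    # Keep lines until the end marker (or end of input if it is missing).
    body = it.takewhile(lambda s: s != end, after_start)
    # Assumption: columns are separated by runs of whitespace.
    return pd.read_csv(io.StringIO("\n".join(body)), sep=r"\s+")


sample = """[Header]
Some_info = some_info
[Data]
Col1    Col2
0.532   Point
0.234   Point
0.123   Point
1.455   Square
14.64   Square
[Other data]
Other1  Other2
Test1   PASS
Test2   FAIL
"""

df = read_section(sample.splitlines())
print(df)
```

For a file on disk, the same helper can be called as read_section(f) inside a with open('file.txt') as f: block. Because it locates both markers itself, neither skiprows nor skipfooter is needed, and the file on disk is left untouched.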
