How can I filter lines on load in the Pandas read_csv function?

Views: 45 · Published: 2021/12/3 8:49:21 · Tags: python, pandas

Question

How can I filter which lines of a CSV are loaded into memory using pandas? This seems like an option that one should find in read_csv. Am I missing something?

Example: we have a CSV with a timestamp column, and we would like to load only the lines whose timestamp is greater than a given constant.

Recommended answer

There isn't an option to filter the rows before the CSV file is loaded into a pandas object.

You can either load the whole file and then filter it with df[df['field'] > constant], or, if the file is very large and you are worried about running out of memory, read it with an iterator and apply the filter to each chunk as you concatenate the chunks, for example:

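import pandas as pd
constant = 0  # placeholder threshold; replace with your own value
# chunksize makes read_csv return an iterator of DataFrames, 1000 rows each
iter_csv = pd.read_csv('file.csv', chunksize=1000)
# filter every chunk before concatenating, so only matching rows are kept
df = pd.concat([chunk[chunk['field'] > constant] for chunk in iter_csv])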

You can vary chunksize to suit your available memory. See the pandas documentation on iterating through files chunk by chunk for more details.
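To tie this back to the timestamp example in the question, here is a minimal sketch of both approaches side by side. The sample data, the column names timestamp and value, the cutoff, and the helper load_filtered are all made up for illustration and are not part of the original answer; an in-memory string stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
csv_text = """timestamp,value
2021-01-01 00:00:00,1
2021-06-01 00:00:00,2
2021-09-15 12:30:00,3
2021-12-01 08:00:00,4
"""

cutoff = pd.Timestamp("2021-06-01")  # assumed threshold, for illustration only


def load_filtered(source, cutoff, chunksize=3):
    """Read the CSV in chunks and keep only rows with timestamp > cutoff."""
    chunks = pd.read_csv(source, parse_dates=["timestamp"], chunksize=chunksize)
    # Each chunk is an ordinary DataFrame, so the usual boolean mask applies.
    kept = [chunk[chunk["timestamp"] > cutoff] for chunk in chunks]
    return pd.concat(kept, ignore_index=True)


# Approach 1: load everything, then filter.
full = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])
filtered_full = full[full["timestamp"] > cutoff].reset_index(drop=True)

# Approach 2: filter chunk by chunk while loading.
filtered_chunked = load_filtered(io.StringIO(csv_text), cutoff)

print(filtered_chunked)
```

Both approaches should yield the same rows. The chunked version only ever holds one chunk plus the rows kept so far, which is why it helps when the full file would not fit in memory.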
