How to read a specific line number in a csv with pandas

Views: 224 Posted: 2020/7/11 21:50:55 Tags: python pandas csv dataframe

I have a huge dataset and I am trying to read it line by line. For now, I am reading the dataset using pandas:

import pandas as pd
df = pd.read_csv("mydata.csv", sep=",", nrows=1)

This function allows me to read only the first line, but how can I read the second, the third one and so on? (I would like to use pandas.)

To be clearer, I need to read one line at a time because the dataset is 20 GB and I cannot hold all of it in memory.

The pandas documentation lists a read_csv parameter for this:

skiprows

If a list is assigned to this parameter, read_csv skips the lines whose 0-based line numbers appear in the list:

skiprows = [0,1]

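This skips the first and second lines of the file. Note that line 0 is normally the header row, so start the list at index 1 if the column names should be kept. A combination of nrows and skiprows therefore allows each line of the dataset to be read separately.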
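As a minimal sketch of that combination, the helper below reads a single data row by skipping everything between the header and the target row. The name read_data_row and the sample data are hypothetical and only for illustration; the sketch assumes the file has a header on its first line.

```python
import io

import pandas as pd


def read_data_row(source, n, **kwargs):
    """Return data row n (0-based, header not counted) as a one-row DataFrame.

    read_data_row is a hypothetical helper, not part of pandas.
    """
    # File line 0 is assumed to be the header, so data row n is file line n + 1.
    # Skipping file lines 1..n keeps the header and drops the n rows before it.
    return pd.read_csv(source, skiprows=range(1, n + 1), nrows=1, **kwargs)


# Small in-memory stand-in for "mydata.csv" (made-up contents).
sample = "a,b\n1,2\n3,4\n5,6\n"
row = read_data_row(io.StringIO(sample), 2)
print(row)
```

Each call scans the file from the beginning up to the requested row, so this suits occasional lookups. For a single pass over the whole 20 GB file, a different technique may fit better: read_csv also accepts a chunksize parameter, which returns an iterator of DataFrames instead of loading everything at once.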
