Python Pandas read_csv skip rows but keep header
I'm having trouble figuring out how to skip n rows in a CSV file but keep the header, which is the first row.
What I want to do is iterate over the file but keep the header from the first row. skiprows makes the header the first row after the skipped rows. What is the best way of doing this?
data = pd.read_csv('test.csv', sep='|', header=0, skiprows=10, nrows=10)
You can pass a list of row numbers to skiprows instead of an integer.
By giving the function the integer 10, you're just skipping the first 10 lines.
To keep row 0 as the header and then skip rows 1 through 9, so that the data starts at row 10, you can write:
pd.read_csv('test.csv', sep='|', skiprows=range(1, 10))
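To make the effect concrete, here is a minimal sketch that applies the same idea to an in-memory file and repeats it page by page. The sample data, the read_page helper, and its parameters are assumptions made for illustration; they are not part of the original question.

```python
import io

import pandas as pd

# Hypothetical sample data: a header line followed by 30 data rows.
SAMPLE = "id|value\n" + "\n".join(f"{i}|{i * i}" for i in range(30))


def read_page(text, page, page_size=10):
    """Read one page of rows while always keeping line 0 as the header.

    Hypothetical helper: skips the data lines of all earlier pages
    (line numbers 1 .. page * page_size), then reads page_size rows.
    """
    return pd.read_csv(
        io.StringIO(text),
        sep="|",
        header=0,
        skiprows=range(1, page * page_size + 1),
        nrows=page_size,
    )


first = read_page(SAMPLE, page=0)   # ids 0..9
second = read_page(SAMPLE, page=1)  # ids 10..19
```

If the goal is simply to walk through the whole file in order, read_csv also accepts a chunksize argument, which returns an iterator of DataFrames that all share the header.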
Other ways to skip rows using read_csv
The two main ways to control which rows read_csv uses are the header and skiprows parameters.
Suppose we have the following CSV file with one column:
a
b
c
d
e
f
In each of the examples below, this file is f = io.StringIO("\n".join("abcdef")).
Read all lines as values (no header, column names default to integers):
>>> pd.read_csv(f, header=None)
   0
0  a
1  b
2  c
3  d
4  e
5  f
Use a particular row as the header (skip all lines before that):
>>> pd.read_csv(f, header=3)
   d
0  e
1  f
Use multiple rows as the header, creating a MultiIndex (skip all lines before the last specified header line):
>>> pd.read_csv(f, header=[2, 4])
   c
   e
0  f
Skip N rows from the start of the file (the first row that's not skipped is the header):
>>> pd.read_csv(f, skiprows=3)
   d
0  e
1  f
Skip one or more rows by giving the row indices (the first row that's not skipped is the header):
>>> pd.read_csv(f, skiprows=[2, 4])
   a
0  b
1  d
2  f
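Several of these variants can be checked in one short script. This is only a sketch: the make_file helper is an assumption introduced here so that every call gets a fresh buffer, since a StringIO object is exhausted after one read. The last call uses the callable form of skiprows, which the examples above do not cover.

```python
import io

import pandas as pd


def make_file():
    # Hypothetical helper: rebuild the six-line file for every call.
    return io.StringIO("\n".join("abcdef"))


no_header = pd.read_csv(make_file(), header=None)
header_row_3 = pd.read_csv(make_file(), header=3)
skip_first_3 = pd.read_csv(make_file(), skiprows=3)
skip_2_and_4 = pd.read_csv(make_file(), skiprows=[2, 4])

# skiprows also accepts a callable that receives the line number and
# returns True for lines to skip; here every odd-numbered line is dropped.
skip_odd = pd.read_csv(make_file(), skiprows=lambda i: i % 2 == 1)

print(skip_2_and_4)
print(skip_odd)
```

The callable form is convenient when the rows to skip follow a rule rather than a fixed list, because line 0 can be kept as the header simply by returning False for it.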