How to ignore the first line of data when processing CSV data?
Problem description
I am asking Python to print the minimum number from a column of CSV data, but the top row is the column number, and I don't want Python to take the top row into account. How can I make sure Python ignores the first line?
This is the code so far:
import csv

with open('all16.csv', 'rb') as inf:
    incsv = csv.reader(inf)
    column = 1
    datatype = float
    data = (datatype(column) for row in incsv)
    least_value = min(data)

print least_value
Could you also explain what you are doing, not just give the code? I am very very new to Python and would like to make sure I understand everything.
Recommended answer
You could use an instance of the csv module's Sniffer class to deduce the format of a CSV file and detect whether a header row is present, along with the built-in next() function to skip over the first row only when necessary:
import csv

with open('all16.csv', 'r', newline='') as file:
    has_header = csv.Sniffer().has_header(file.read(1024))
    file.seek(0)  # Rewind.
    reader = csv.reader(file)
    if has_header:
        next(reader)  # Skip header row.
    column = 1
    datatype = float
    data = (datatype(row[column]) for row in reader)
    least_value = min(data)

print(least_value)
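To make each step concrete, here is a minimal, self-contained sketch of the same approach run on in-memory text instead of all16.csv. The helper name min_of_column and the sample data are made up for illustration, and io.StringIO stands in for the opened file.

```python
import csv
import io


def min_of_column(file, column=1, datatype=float):
    """Return the smallest value in one column, skipping a detected header.

    Hypothetical helper for illustration; `file` must be a seekable
    text file object.
    """
    # Let the Sniffer guess, from the first 1024 characters, whether the
    # first row looks like a header rather than data.
    has_header = csv.Sniffer().has_header(file.read(1024))
    file.seek(0)  # Reading the sample moved the position; go back to the start.

    reader = csv.reader(file)  # Yields each row as a list of strings.
    if has_header:
        next(reader)  # Consume and discard the header row.

    # Convert the chosen cell of every remaining row, then take the minimum.
    return min(datatype(row[column]) for row in reader)


# Made-up sample: one header row followed by three data rows.
sample = io.StringIO("id,value\n1,3.5\n2,1.25\n3,7.0\n")
print(min_of_column(sample))
```

The call to next() works here because csv.reader returns an iterator: advancing it once throws away one row, and the generator expression then only sees the rows that are left.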
Since datatype and column are hardcoded in your example, it would be slightly faster to process the row like this:
data = (float(row[1]) for row in reader)
Note: the code above is for Python 3.x. For Python 2.x use the following line to open the file instead of what is shown:
with open('all16.csv', 'rb') as file:
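Sniffer.has_header() is a heuristic, so it can guess wrong on unusual data; a header made up only of numbers, such as the column numbers mentioned in the question, is the kind of row it may not recognize. If the first row is known to always be a header, a simpler variant is to skip it unconditionally. The sketch below (Python 3) assumes exactly that; min_skipping_first_row and the sample rows are hypothetical and only for illustration.

```python
import csv
import io


def min_skipping_first_row(file, column=1):
    """Return the minimum of one column, always ignoring the first row.

    Hypothetical helper; assumes the first row is a header and every
    remaining row holds a number in the chosen column.
    """
    reader = csv.reader(file)
    # Discard the first row. The None default keeps next() from raising
    # StopIteration on an empty file; min() below still raises ValueError
    # if no data rows remain.
    next(reader, None)
    return min(float(row[column]) for row in reader)


# Made-up sample whose header row is just column numbers.
sample = io.StringIO("1,2\n10,3.5\n11,4.25\n")
print(min_skipping_first_row(sample))
```

With this sample, the header value 2 in the second column is never considered, so the result comes from the data rows only.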