使用python读取csv中的特定列 [英] Read specific columns in csv using python

查看：1313 发布时间：2020/7/11 22:38:57 python sql csv

本文介绍了使用python读取csv中的特定列的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有一个csv文件，如下所示:

I have a csv file that look like this:

+-----+-----+-----+-----+-----+-----+-----+-----+
| AAA | bbb | ccc | DDD | eee | FFF | GGG | hhh |
+-----+-----+-----+-----+-----+-----+-----+-----+
|   1 |   2 |   3 |   4 |  50 |   3 |  20 |   4 |
|   2 |   1 |   3 |   5 |  24 |   2 |  23 |   5 |
|   4 |   1 |   3 |   6 |  34 |   1 |  22 |   5 |
|   2 |   1 |   3 |   5 |  24 |   2 |  23 |   5 |
|   2 |   1 |   3 |   5 |  24 |   2 |  23 |   5 |
+-----+-----+-----+-----+-----+-----+-----+-----+

...

如何只读取python中的"AAA，DDD，FFF，GGG"列并跳过标题? 我想要的输出是看起来像这样的元组列表: [(1,4,3,20)，(2,5,2,23)，(4,6,1,22)].我正在考虑稍后将这些数据写入SQL数据库.

How can I only read the columns "AAA,DDD,FFF,GGG" in python and skip the headers? The output I want is a list of tuples that looks like this: [(1,4,3,20),(2,5,2,23),(4,6,1,22)]. I'm thinking to write these data to a SQLdatabase later.

我提到了这篇文章:从带有csv模块的csv文件?. 但是我认为这对我的情况没有帮助.由于我的.csv很大，有一整列的列，我希望我可以告诉python我想要的列名，以便python可以为我逐行读取特定的列.

I referred to this post:Read specific columns from a csv file with csv module?. But I don't think it is helpful in my case. Since my .csv is pretty big with whole bunch of columns, I hope I can tell python the column names I want, so python can read the specific columns row by row for me.

推荐答案

def read_csv(file, columns, type_name="Row"):
  try:
    row_type = namedtuple(type_name, columns)
  except ValueError:
    row_type = tuple
  rows = iter(csv.reader(file))
  header = rows.next()
  mapping = [header.index(x) for x in columns]
  for row in rows:
    row = row_type(*[row[i] for i in mapping])
    yield row

示例:

>>> import csv
>>> from collections import namedtuple
>>> from StringIO import StringIO
>>> def read_csv(file, columns, type_name="Row"):
...   try:
...     row_type = namedtuple(type_name, columns)
...   except ValueError:
...     row_type = tuple
...   rows = iter(csv.reader(file))
...   header = rows.next()
...   mapping = [header.index(x) for x in columns]
...   for row in rows:
...     row = row_type(*[row[i] for i in mapping])
...     yield row
... 
>>> testdata = """\
... AAA,bbb,ccc,DDD,eee,FFF,GGG,hhh
... 1,2,3,4,50,3,20,4
... 2,1,3,5,24,2,23,5
... 4,1,3,6,34,1,22,5
... 2,1,3,5,24,2,23,5
... 2,1,3,5,24,2,23,5
... """
>>> testfile = StringIO(testdata)
>>> for row in read_csv(testfile, "AAA GGG DDD".split()):
...   print row
... 
Row(AAA='1', GGG='20', DDD='4')
Row(AAA='2', GGG='23', DDD='5')
Row(AAA='4', GGG='22', DDD='6')
Row(AAA='2', GGG='23', DDD='5')
Row(AAA='2', GGG='23', DDD='5')

这篇关于使用python读取csv中的特定列的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

使用python读取csv中的特定列 [英] Read specific columns in csv using python

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

使用python读取csv中的特定列 [英] Read specific columns in csv using python

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭