How to extract named columns from a CSV?

Question

I have a csv file that contains around 50 columns, but I only need about 10 of them. I want to be able to extract the columns I need from that csv file to a new csv file.

The top answer to the post "How to delete columns in a CSV file?" looks like it will do exactly what I need.

BUT this is something I will need to do daily, and the system that generates the big CSV file can export the columns in different orders. So I need to be able to specify the columns I need by name, rather than by number.

The following represents the CSV files:

File1.csv

name, description, cost, image, date
ABC, "super, mega", 12.87, ./imagefile, "12/11/2012 08:12"

File2.csv

name, cost, date, description, image
SYZ, 43.98, "16/11/2012 09:16", "Some text, and such", ./image2.jpeg

I want to keep the name, description and image fields only, but if I use the code (derived from the post above by @S.Lott):

import csv

# Copies columns by position, so it depends on the column order of the input.
with open("source", newline="") as source, open("result", "w", newline="") as result:
    rdr = csv.reader(source)
    wtr = csv.writer(result)
    for r in rdr:
        wtr.writerow((r[0], r[1], r[3]))

It will only work for the first file and not the second.
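To make the failure concrete, here is a small sketch (not part of the original question) that looks up where the three wanted names sit in each header. The header lines are copied from the sample files above; the helper name `positions` is made up for illustration.

```python
import csv
import io

# Header rows copied from File1.csv and File2.csv above.
headers = {
    "File1.csv": "name, description, cost, image, date",
    "File2.csv": "name, cost, date, description, image",
}
wanted = ["name", "description", "image"]


def positions(header_line, names):
    """Return the index of each wanted name in a CSV header line."""
    # skipinitialspace drops the blank that follows each comma in the samples.
    row = next(csv.reader(io.StringIO(header_line), skipinitialspace=True))
    return [row.index(name) for name in names]


for filename, line in headers.items():
    print(filename, positions(line, wanted))
```

For File1.csv the wanted names sit at positions 0, 1 and 3, which happens to match the hard-coded indices; for File2.csv they sit at 0, 3 and 4, so the same indices pick up the wrong columns.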

Recommended answer

The advantage of using pandas for this is not only that it makes it easy to open and save your files in different formats and to modify columns and rows, but also that you can modify, calculate and play with your data if you need to.

Obtaining a csv file with selected columns is straightforward:

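import pandas as pd

# skipinitialspace=True strips the blank after each comma, so the header
# reads 'cost' rather than ' cost' and the quoted fields parse correctly
df = pd.read_csv('File2.csv', skipinitialspace=True)  # reads your csv file as a table (dataframe object)

df2 = df[['cost', 'date']]    # selects two of the columns in your file, by name

df2.to_csv('my_out.csv', index=False)  # saves again in csv format, without the row index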
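If installing pandas is not an option, the same selection by name can probably be done with the standard library alone. The following is a minimal sketch, not part of the original answer: the function name `extract_columns` and the output file name are made up for illustration, and it assumes the first row of the input is the header.

```python
import csv


def extract_columns(src_path, dst_path, wanted):
    """Copy only the columns named in `wanted` from src_path to dst_path."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        # skipinitialspace handles the blank after each comma in the sample files.
        reader = csv.DictReader(src, skipinitialspace=True)
        # extrasaction="ignore" makes the writer drop every column not in `wanted`.
        writer = csv.DictWriter(dst, fieldnames=wanted, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            writer.writerow(row)
```

Calling `extract_columns("File2.csv", "result.csv", ["name", "description", "image"])` should write those three columns in that order, whatever order they have in the input. A wanted name that is absent from the input produces an empty field rather than an error, so a stricter version might check `reader.fieldnames` first.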