Sorting CSV in Python


Question

I assumed sorting a CSV file on multiple text/numeric fields using Python would be a problem that was already solved. But I can't find any example code anywhere, except for specific code focusing on sorting date fields.

How would one go about sorting a relatively large CSV file (tens of thousands of lines) on multiple fields, in order?

Python code samples would be appreciated.

Solution

Here's Alex's answer, reworked to support column data types:

import csv
import operator

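def sort_csv(csv_filename, types, sort_key_columns):
    """sort (and rewrite) a csv file.
    types:  data types (conversion functions) for each column in the file
    sort_key_columns: column numbers of columns to sort by"""
    data = []
    # Python 3: open csv files in text mode with newline='' (Python 2 used 'rb'/'wb')
    with open(csv_filename, newline='') as f:
        for row in csv.reader(f):
            # convert() is not a built-in; it is defined further down
            data.append(convert(types, row))
    # with several columns, itemgetter yields a tuple key: sort by the first listed column, then the next
    data.sort(key=operator.itemgetter(*sort_key_columns))
    with open(csv_filename, 'w', newline='') as f:
        csv.writer(f).writerows(data)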
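
To see why a single itemgetter call is enough for a multi-field sort, here is a small sketch. The rows are made-up sample data (not from the question), assumed to be already converted to int, str and float columns.

```python
import operator

# Hypothetical rows, already converted to (int, str, float) per column.
rows = [
    [2, "pear", 1.5],
    [1, "apple", 3.0],
    [2, "apple", 0.5],
    [1, "apple", 2.0],
]

# itemgetter(0, 1) returns a (column 0, column 1) tuple for each row.
# Tuples compare element by element, so this sorts by column 0 first
# and breaks ties with column 1.
key = operator.itemgetter(0, 1)
sorted_rows = sorted(rows, key=key)

for row in sorted_rows:
    print(row)
```

Python's sort is stable, so rows whose key columns are all equal keep their original relative order. With a single column, itemgetter returns the bare value rather than a tuple, which works just as well as a sort key.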

Edit:

I did a stupid. I was playing with various things in IDLE and wrote a convert function a couple of days ago. I forgot I'd written it, and I haven't closed IDLE in a good long while - so when I wrote the above, I thought convert was a built-in function. Sadly no.

Here's my implementation, though John Machin's is nicer:

def convert(types, values):
    return [t(v) for t, v in zip(types, values)]

Usage:

import datetime
def date(s):
    # strptime lives on the datetime class, not on the datetime module
    return datetime.datetime.strptime(s, '%m/%d/%y')

>>> convert((int, date, str), ('1', '2/15/09', 'z'))
[1, datetime.datetime(2009, 2, 15, 0, 0), 'z']
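
Putting the pieces together, the following is a minimal end-to-end sketch, assuming Python 3 (where csv files are opened in text mode with newline=''). The sample file, its column layout (name, age, score) and the lack of a header row are assumptions made up for illustration.

```python
import csv
import operator
import os
import tempfile


def convert(types, values):
    # Apply one conversion function per column.
    return [t(v) for t, v in zip(types, values)]


def sort_csv(csv_filename, types, sort_key_columns):
    # Read every row into memory, converting each column to its type.
    with open(csv_filename, newline='') as f:
        data = [convert(types, row) for row in csv.reader(f)]
    # Sort by the listed columns, in the order given.
    data.sort(key=operator.itemgetter(*sort_key_columns))
    # Overwrite the original file with the sorted rows.
    with open(csv_filename, 'w', newline='') as f:
        csv.writer(f).writerows(data)


# Hypothetical sample file: name, age, score (no header row).
fd, path = tempfile.mkstemp(suffix='.csv')
with os.fdopen(fd, 'w', newline='') as f:
    csv.writer(f).writerows([
        ['bob', '30', '7.5'],
        ['alice', '30', '9.0'],
        ['carol', '25', '8.0'],
    ])

# Sort by age (column 1) as an int, then by name (column 0) as text.
sort_csv(path, (str, int, float), (1, 0))

with open(path, newline='') as f:
    result = list(csv.reader(f))
os.remove(path)

print(result)
```

Two caveats with this approach: converted values are written back using their default string form, so a column converted with a date function would be rewritten in a different format than it was read in, and a header row would need to be split off before sorting. The whole file is held in memory, which should be fine for tens of thousands of lines.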
