Joining all rows of a CSV file that have the same 1st column value in Python

Question

I have a CSV file that goes something like this:


['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '']
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

Now I need a way to merge all of the rows that share the same 1st column name into a single row, for instance:


['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', '+']
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', '']
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '']

I can think of a way to do this by sorting the CSV and then going through each row and column and comparing each value, but there is probably an easier way to do it.

Any ideas?

Recommended answer

You should use itertools.groupby:

t = [ 
['Name1', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '+'],
['Name1', '', '', '', '', '', 'b', '', '', '', '', '', '', '', '', '', '', '', '', '', ''],
['Name2', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', 'a', ''],
['Name3', '', '', '', '', '+', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] 
]

from itertools import groupby

# join_rows is defined below; define it before running this loop.
# To speed things up, operator.itemgetter(0) can serve as the key
# for both sorting and grouping.
for name, rows in groupby(sorted(t), key=lambda x: x[0]):
    print(join_rows(rows))
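
The sort matters because groupby only groups consecutive items with equal keys. A minimal sketch of that behaviour, using made-up two-column rows (the data and the names `rows` and `first` are illustrative, not from the question):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows where the 'Name1' entries are not adjacent.
rows = [['Name1', 'a'], ['Name2', 'b'], ['Name1', 'c']]

first = itemgetter(0)  # key function: the first column

# Without sorting, 'Name1' shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(rows, key=first)]

# Sorting by the same key brings equal names together.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=first), key=first)]

print(unsorted_keys)  # ['Name1', 'Name2', 'Name1']
print(sorted_keys)    # ['Name1', 'Name2']
```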

The merging itself is best implemented in a separate function, for example like this:

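def join_rows(rows):
    """Merge rows column by column into a single row."""
    def join_tuple(tup):
        # Return the first non-empty value in this column, or '' if none.
        for x in tup:
            if x:
                return x
        return ''
    # zip(*rows) transposes the rows so each tuple holds one column.
    return [join_tuple(x) for x in zip(*rows)]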
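
Putting the pieces together, here is one possible end-to-end sketch that reads the rows with the csv module instead of a hard-coded list. The function name `merge_by_first_column` and the sample data are assumptions made for illustration. It also assumes every row has the same number of columns and that the input has no blank lines.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def join_rows(rows):
    """Merge rows column by column, keeping the first non-empty value."""
    return [next((value for value in column if value), '')
            for column in zip(*rows)]


def merge_by_first_column(csv_file):
    """Return one merged row per distinct first-column value."""
    first = itemgetter(0)
    # Sort by the first column only, so equal names become adjacent.
    rows = sorted(csv.reader(csv_file), key=first)
    return [join_rows(group) for _, group in groupby(rows, key=first)]


# Hypothetical sample data standing in for a real file opened with
# open('data.csv', newline='').
sample = io.StringIO(
    "Name1,,,+\n"
    "Name1,b,,\n"
    "Name2,,a,\n"
)

for row in merge_by_first_column(sample):
    print(row)
```

If two rows with the same name both have a value in the same column, this sketch keeps the one that appears first in the file, since Python's sort is stable.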
