How to parse a CSV with Python when one column has multiple lines

Question

I have a csv file that is "name, place, thing". The thing column often has "word\nanotherword\nanotherword\n". I'm trying to figure out how to parse this out into individual lines instead of multiline entries in a single column, i.e.

name, place, word
name, place, anotherword
name, place, anotherword

I'm certain this is simple, but I'm having a hard time grasping what I need to do.

Recommended answer

Without going into the code, what you essentially want to do is check whether there are any newline characters in your 'thing'. If there are, split the value on them. This gives you a list of tokens (the lines in the 'thing'), and since this is essentially an inner loop, you can pair the original name and place with each new thing_token. A generator function lends itself well to this.

This brings me to kroolik's answer. However, there's a slight error in kroolik's answer:

If you want to go with the column_wrapper generator, you need to account for the fact that the newlines in the values returned by the csv reader are backslash-escaped, so they look like \\n instead of \n; that is, each one is a literal backslash followed by n rather than a real newline character. Also, you need to check for blank 'things'.
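A quick way to see the difference is to read one row from an in-memory sample. This is only a minimal sketch: the sample row and the variable names are made up for illustration, and it assumes the file stores the two characters backslash and n rather than real line breaks.

```python
import csv
import io

# Hypothetical sample row: the third field contains the two characters
# backslash + n (written '\\n' in Python source), not a real line break.
raw = 'alice,paris,word\\nanotherword\\n\n'

row = next(csv.reader(io.StringIO(raw)))
thing = row[2]

# The field has no real newline in it, only the literal sequence.
has_real_newline = '\n' in thing
has_literal_backslash_n = '\\n' in thing

# Splitting on the literal sequence leaves an empty trailing piece,
# which is why blank entries need to be filtered out.
pieces = [piece for piece in thing.split('\\n') if piece]

print(repr(thing), has_real_newline, has_literal_backslash_n, pieces)
```

If has_real_newline comes out True for your own file, the field holds actual line breaks and splitting on '\\n' alone will not separate the entries.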

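def column_wrapper(reader):
    """Yield one (name, place, thing) row per entry in the thing column."""
    for name, place, thing in reader:
        # '\\n' is a literal backslash followed by n, not a real newline
        for split_thing in thing.strip().split('\\n'):
            if split_thing:  # skip blank entries
                yield name, place, split_thing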

Then you can obtain the data like this:

with open('filewithdata.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    data = [[name, place, thing] for name, place, thing in column_wrapper(reader)]

OR (without column_wrapper):

data = []
with open('filewithdata.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        name, place, thing = row
        # Splitting also covers things with no '\\n': they yield a single item
        for item in thing.strip().split('\\n'):
            if item:  # skip blank entries, e.g. after a trailing \n
                data.append([name, place, item])

I recommend using column_wrapper, as generators are more generic and Pythonic.
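To try the generator approach without a file on disk, something like the following sketch may help. It is not the code from either answer: split_rows, the regular expression, and the sample rows are all hypothetical, and the pattern additionally assumes you may want to handle real line breaks inside quoted fields as well as the literal \n sequence discussed above.

```python
import csv
import io
import re

def split_rows(reader):
    """Yield one (name, place, thing) tuple per non-empty entry of the thing column."""
    for name, place, thing in reader:
        # Split on a real line break or on the literal two-character sequence \n.
        for part in re.split(r'\r?\n|\\n', thing):
            part = part.strip()
            if part:  # drop blank entries, e.g. after a trailing separator
                yield name, place, part

# Hypothetical in-memory sample standing in for the csv file.
sample = io.StringIO(
    'bob,rome,"first\nsecond\n"\n'          # real line breaks in a quoted field
    'alice,paris,word\\nanotherword\\n\n'   # literal backslash-n text
    'carol,oslo,single\n'                   # nothing to split
)

rows = list(split_rows(csv.reader(sample)))
for row in rows:
    print(row)
```

Because split_rows is a generator, it can be fed any iterable of three-item rows, so the same function works for a csv.reader over a real file opened with open().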

Be sure to add import csv to the top of your file (although I'm sure you knew that already). Hope that helps!
