Python 3 reading CSV file with line breaks in rows


Problem description

I have a large CSV file with a single column, and some of its rows contain line breaks. I want to read the content of each cell and write it to its own text file, but the CSV reader splits the cells that contain line breaks into multiple rows and writes each piece to a separate text file.

Using Python 3.6.2 on macOS Sierra.

Here is a sample:

"content of row 1"
"content of row 2 
 continues here"
"content of row 3"

This is how I read it:

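import csv

# csvFileName holds the path to the input CSV file
with open(csvFileName, 'r') as csvfile:
    lines = csv.reader(csvfile)

    i = 0
    for row in lines:
        i += 1
        content = row[0]  # the file has a single column

        # Write each cell to its own numbered text file
        with open("output" + str(i) + ".txt", 'w') as outFile:
            outFile.write(content)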

This creates 4 files instead of 3 (one per row). Any suggestions on how to ignore the line break in the second row?

Recommended answer

You could define a regular expression pattern to help you iterate over the rows.

Read the entire file contents into a single string, if the file is small enough for that.

s = '''"content of row 1"
"content of row 2 
 continues here"
"content of row 3"'''
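
With a real file, s would come from disk rather than from a literal. A minimal sketch, assuming the whole file fits in memory; read_whole_file and demo.csv are hypothetical names used only for illustration:

```python
import os
import tempfile


def read_whole_file(path):
    # newline='' returns line breaks exactly as they are stored in the file
    with open(path, 'r', newline='') as f:
        return f.read()


# Hypothetical demo file standing in for the real CSV
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')
with open(demo_path, 'w', newline='') as f:
    f.write('"content of row 1"\n'
            '"content of row 2 \n continues here"\n'
            '"content of row 3"')

s = read_whole_file(demo_path)
```

For a file too large to hold in memory, this step would not apply and a streaming approach would be needed instead.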

Pattern: a double quote, followed by anything that isn't a double quote, followed by a double quote:

import re

# [^"] already matches newline characters, so no regex flags are needed
row_pattern = r'"[^"]*"'
row = re.compile(row_pattern)
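
One limitation worth noting: the pattern above stops at the first double quote it meets, so a cell that itself contains quotes would be cut short. A possible variant, assuming embedded quotes are escaped by doubling them ("") as in common CSV dialects; the names quoted_field, sample and fields are illustrative only:

```python
import re

# Assumption: a literal quote inside a cell is written as two quotes ("")
quoted_field = re.compile(r'"(?:[^"]|"")*"')

sample = '"plain"\n"has ""quoted"" word\n and a line break"'

# Each match is one complete quoted cell, including its embedded line break
fields = [m.group() for m in quoted_field.finditer(sample)]
```

If the file never contains quotes inside a cell, the simpler pattern is enough.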

Iterate over the rows:

for r in row.finditer(s):
    print(r.group())
    print('******')

>>> 
"content of row 1"
******
"content of row 2 
 continues here"
******
"content of row 3"
******
>>>
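
As an alternative to the regular expression, the standard csv module may already do the job: when cells are wrapped in double quotes as in the sample, csv.reader treats a line break inside the quotes as part of the cell, and the csv documentation recommends opening the file with newline=''. The sketch below rests on that assumption; split_rows_to_files and the temporary demo file are hypothetical names, not part of the original question.

```python
import csv
import os
import tempfile


def split_rows_to_files(csv_path, out_dir):
    """Write the first cell of every CSV row to output<N>.txt; return the paths."""
    written = []
    # newline='' lets the csv module handle line breaks inside quoted cells
    with open(csv_path, 'r', newline='') as csvfile:
        for i, row in enumerate(csv.reader(csvfile), start=1):
            if not row:  # csv.reader yields [] for a completely blank line
                continue
            out_path = os.path.join(out_dir, 'output' + str(i) + '.txt')
            with open(out_path, 'w', newline='') as out_file:
                out_file.write(row[0])
            written.append(out_path)
    return written


# Hypothetical demo using the sample data from the question
work_dir = tempfile.mkdtemp()
demo_csv = os.path.join(work_dir, 'demo.csv')
with open(demo_csv, 'w', newline='') as f:
    f.write('"content of row 1"\n'
            '"content of row 2 \n continues here"\n'
            '"content of row 3"\n')

paths = split_rows_to_files(demo_csv, work_dir)
```

If this still produces one file too many on the real data, the cells are probably not consistently quoted, and a pattern-based split like the one above is the more likely fit.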
