Python 3 reading CSV file with line breaks in rows
Question
I have a large CSV file with a single column, and some of its rows contain line breaks. I want to read the content of each cell and write it to its own text file, but the CSV reader splits cells that contain line breaks into several cells (several rows) and writes each one to a separate text file.
Using Python 3.6.2 on macOS Sierra.
Here is a sample:
"content of row 1"
"content of row 2
continues here"
"content of row 3"
This is how I am reading it:
with open(csvFileName, 'r') as csvfile:
    lines = csv.reader(csvfile)
    i = 0
    for row in lines:
        i += 1
        content = row
        outFile = open("output" + str(i) + ".txt", 'w')
        outFile.write(content)
        outFile.close()
This creates 4 files instead of 3 (one per row). Any suggestions on how to ignore the line break in the second row?
Recommended answer
You could define a regular expression pattern to help you iterate over the rows.
Read the entire file contents into a string, if the file is small enough to fit in memory.
s = '''"content of row 1"
"content of row 2
continues here"
"content of row 3"'''
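The literal above stands in for the file contents. For a real file, the string would come from reading the file in one go. A minimal sketch, assuming the file fits in memory and is UTF-8 encoded; the helper name `read_whole_file` and the temporary demo file are hypothetical and only serve to make the example self-contained:

```python
import os
import tempfile


def read_whole_file(path, encoding="utf-8"):
    """Return the full text of the file at `path` (assumes it fits in memory)."""
    # newline='' keeps the line breaks inside quoted cells exactly as stored.
    with open(path, "r", encoding=encoding, newline="") as f:
        return f.read()


# Hypothetical demo file standing in for the real CSV.
sample = '"content of row 1"\n"content of row 2\ncontinues here"\n"content of row 3"'
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write(sample)

s = read_whole_file(tmp.name)
os.remove(tmp.name)  # clean up the demo file

print(s)
```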
Pattern: a double quote, followed by anything that isn't a double quote, followed by a double quote:
import re

# [^"] already matches newlines, so the flags below do not change the result
row_pattern = '''"[^"]*"'''
row = re.compile(row_pattern, flags = re.DOTALL | re.MULTILINE)
Iterate over the rows:
for r in row.finditer(s):
    print(r.group())
    print('******')
>>>
"content of row 1"
******
"content of row 2
continues here"
******
"content of row 3"
******
>>>
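Note that the pattern above treats every double quote as a cell boundary, so a cell containing an escaped quote (written as two double quotes in CSV) would be split. As an alternative to the regex, and not the answerer's method, the csv module itself keeps a quoted cell that spans several lines as one field, provided the cells really are wrapped in double quotes as in the sample. A minimal sketch under that assumption; the function name `write_cells_to_files`, the output file names, and the temporary directory are hypothetical:

```python
import csv
import os
import tempfile


def write_cells_to_files(csv_path, out_dir):
    """Write the first cell of every CSV record to its own text file.

    Returns the list of created file paths.
    """
    created = []
    # newline='' lets the csv module handle line endings itself.
    with open(csv_path, "r", newline="", encoding="utf-8") as csvfile:
        for i, row in enumerate(csv.reader(csvfile), start=1):
            if not row:  # csv.reader yields [] for completely blank lines
                continue
            out_path = os.path.join(out_dir, "output{}.txt".format(i))
            with open(out_path, "w", encoding="utf-8") as out_file:
                out_file.write(row[0])  # row is a list; take the single column
            created.append(out_path)
    return created


# Demo with the sample data from the question.
sample = '"content of row 1"\n"content of row 2\ncontinues here"\n"content of row 3"\n'
with tempfile.TemporaryDirectory() as work_dir:
    csv_path = os.path.join(work_dir, "sample.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        f.write(sample)
    files = write_cells_to_files(csv_path, work_dir)
    contents = []
    for p in files:
        with open(p, encoding="utf-8") as fh:
            contents.append(fh.read())

print(len(files))   # number of files created
print(contents[1])  # the two-line cell, written to a single file
```

If this still produces one file per physical line on the real data, the cells are probably not quoted in the file, and the csv module has no way to tell a line break inside a cell from the end of a row.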