Parsing a csv file with column data in Python
Question
I want to read the first 3 columns of a csv file and modify them before storing them. Data in the csv file:
{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12
{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\nTry Again,1,0,12
{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\nTry Again,1,0,12
From column 1, I just want to parse the value 0 or 1 inside the [ ]. From column 2, I want the value as is. From column 3, I want to parse the substring "Sensor1 Not Ready", convert it to upper case, and replace the spaces with underscores (e.g. SENSOR1_NOT_READY). Then I want to print the resulting string in a new column.
Parsed format:
**<value from column 1>.<value from column 2>.<string from column 3>**
I am new to coding in Python. Can someone help me with this? What is the best and most efficient way to do this? TIA
What I have tried so far:
import csv
from collections import defaultdict

columns = defaultdict(list)  # column index -> list of that column's values
with open('filename.csv', newline='') as f:  # Python 3: text mode, newline=''
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        for i, value in enumerate(row):
            columns[i].append(value)
columns = dict(columns)
Is this a good way to handle column 3?
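x = columns[2][0]          # parsed data from column 3 (first row here)
# Assumes the file stores \n as the two literal characters backslash + n
a, b = x.split("\\n")      # 'a' denotes the substring before \n
c, d = a.split("-")        # 'd' denotes the substring after '-'
e = d.strip().upper()      # strip the space that follows '-'
new_str = e.replace(" ", "_")
print(new_str)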
Recommended answer
My suggestion is to read each line as a whole string and then extract the desired data with the re module, like this:
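import re

# Raw string, so \[ and \d reach the regex engine unchanged;
# the trailing \n matches the real newline character in `line` below.
term = r'\[(\d)\].*,(\d+),.*-\s([\w\s]+)\n'
line = '{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12'
capture = list(re.search(term, line).groups())  # ['0', '1', 'Sensor1 Not Ready']
capture[-1] = '_'.join(capture[-1].split()).upper()
result = ','.join(capture)
# result == '0,1,SENSOR1_NOT_READY'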
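The same idea can be sketched for a whole file. This is only an illustration under a few assumptions: the helper names parse_line and parse_file are hypothetical, the pattern is slightly tightened compared with the one above, and the file is assumed to store \n as the two literal characters backslash + n (the pattern also accepts a real newline, as in the example string above). The pieces are joined with '.' to match the format requested in the question.

```python
import re

# Assumption: the index in [ ] is followed by "}," and the message in
# column 3 ends at a literal backslash-n or at a real newline character.
PATTERN = re.compile(r'\[(\d+)\]\},(\d+),[^,]*?-\s*(.+?)(?:\\n|\n)')


def parse_line(line):
    """Return '<col1 index>.<col2 value>.<COL3_MESSAGE>', or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    index, number, message = match.groups()
    # Collapse whitespace to underscores, then upper-case the message.
    return '.'.join([index, number, '_'.join(message.split()).upper()])


def parse_file(path):
    """Parse every line of the file at `path`, skipping lines that do not match."""
    with open(path) as f:
        return [parsed for parsed in map(parse_line, f) if parsed is not None]


if __name__ == '__main__':
    sample = r'{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12'
    print(parse_line(sample))
```

Returning None for lines that do not match keeps a malformed row from raising an AttributeError, which re.search(...).groups() would do when there is no match.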