Parsing a csv file with column data in Python

Problem description

I want to read the first 3 columns of a csv file and modify them before storing them. Data in the csv file:

{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12

{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\nTry Again,1,0,12

{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\nTry Again,1,0,12

From column 1, I just want to parse the value 0 or 1 within the [ ]. From column 2, I want the value as it is. From column 3, I want to parse the substring "Sensor1 Not Ready", convert it to upper case, and replace the spaces with underscores (e.g. SENSOR1_NOT_READY). Then I want to print the resulting string in a new column.

Parsed format:

**<value from column 1>.<value from column 2>.<string from column 3>**

I am new to coding in Python. Can someone help me with this? What is the best and most efficient way to do this? TIA

What I have tried so far:

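import csv
from collections import defaultdict

columns = defaultdict(list)

# Python 3: open in text mode with newline='' as the csv module expects
with open('filename.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        # Collect each value under its column index
        for i, value in enumerate(row):
            columns[i].append(value)
    columns = dict(columns)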

Is this a good approach for column 3?

x = 'U0.00 - Sensor1 Not Ready\nTry Again'  # sample value parsed from column 3
a, b = x.split("\n")  # 'a' is the substring before \n
c, d = a.split("-")  # 'd' is the substring after '-'
e = d.strip().upper()  # strip the leading space left by the split
new_str = e.replace(" ", "_")
print(new_str)

Recommended answer

My suggestion is to read each whole line as a string and then extract the desired data with the re module, like this:

import re

term = r'\[(\d)\].*,(\d+),.*-\s([\w\s]+)\n'

line = '{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\nTry Again,1,0,12'
capture = list(re.search(term, line).groups())
capture[-1] = '_'.join(capture[-1].split()).upper()
result = ','.join(capture)
# result == '0,1,SENSOR1_NOT_READY'
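
For comparison, here is a minimal sketch of a column-based alternative that uses csv.reader instead of one regular expression per line and joins the pieces with '.' as the question requested. It is not the answerer's method. The helper name format_row, the INDEX_RE pattern, and the in-memory sample are hypothetical, and the sketch assumes the file stores the line break in column 3 as the two literal characters backslash and n (it also tolerates a real newline).

```python
import csv
import io
import re

# Hypothetical pattern: the trailing [n] at the end of column 1.
INDEX_RE = re.compile(r'\[(\d+)\]\}?$')


def format_row(row):
    """Build '<index>.<column 2>.<MESSAGE>' from the first three columns."""
    col1, col2, col3 = row[:3]
    index = INDEX_RE.search(col1).group(1)
    # Assumption: column 3 separates its two parts with a literal
    # backslash + n; a real newline is accepted as well.
    first_part = re.split(r'\\n|\n', col3)[0]
    # Keep the text after the first '-', then normalise it.
    message = first_part.split('-', 1)[1]
    message = '_'.join(message.split()).upper()
    return '.'.join([index, col2, message])


# In-memory stand-in for the csv file, built from the rows in the question.
sample = (
    '{::[name]str1_str2_str3[0]},1,U0.00 - Sensor1 Not Ready\\nTry Again,1,0,12\n'
    '{::[name]str1_str2_str3[1]},2,U0.00 - Sensor2 Not Ready\\nTry Again,1,0,12\n'
)

results = [format_row(row) for row in csv.reader(io.StringIO(sample))]
print(results)
```

To run this against a real file, io.StringIO(sample) could be swapped for a file object opened with open('filename.csv', newline=''). Rows that do not match the expected layout would raise an exception here, so real data may need extra checks.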
