Source file delimiter issue

Question

I am facing an issue with my source file. Suppose the file contains the following data:

"dfjsdlfkj,fsdkfj,werkj",234234,234234,,"dfsd,etwetr"

Here the delimiter is a comma, but some fields contain commas as part of the data. Such fields are enclosed in double quotes. I want to extract a few columns from the file.

If I use cut -d "," -f 1,3, I get output like this, because cut splits on every comma and ignores the quotes:

"dfjsdlfkj,werkj"

Recommended answer

I suggest using a CSV parser. For example, Python ships one as a built-in module, so you only have to import it:

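import sys
import csv

# newline='' lets the csv module handle line endings itself
with open(sys.argv[1], newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    csvwriter = csv.writer(sys.stdout)
    for row in csvreader:
        # keep the first and third columns (zero-based indexes 0 and 2)
        csvwriter.writerow([row[e] for e in (0, 2)])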

Assuming your example line is in an input file named infile, run the script as:

python3 script.py infile

The result is:

"dfjsdlfkj,fsdkfj,werkj",234234
