Source file delimiter issue
Problem description
I am facing an issue with my source file. Suppose the file contains the following data:
"dfjsdlfkj,fsdkfj,werkj",234234,234234,,"dfsd,etwetr"
Here the delimiter is a comma, but some fields contain commas as part of their data. Such fields are enclosed in double quotes. I want to extract a few columns from the file.
If I use cut -d "," -f 1,3, I get output like this:
"dfjsdlfkj,werkj"
Recommended answer
I suggest using a csv parser, because cut splits on every comma and does not understand quoting. Python, for example, ships one as a built-in module, so you only have to import it:
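import csv
import sys

# newline='' lets the csv module handle line endings itself
with open(sys.argv[1], newline='') as csvfile:
    csvreader = csv.reader(csvfile)
    csvwriter = csv.writer(sys.stdout)
    for row in csvreader:
        # keep the first and third columns (0-based indexes 0 and 2)
        csvwriter.writerow([row[e] for e in (0, 2)])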
Assuming your example line is in an input file named infile, run the script as:
python3 script.py infile
The result is:
"dfjsdlfkj,fsdkfj,werkj",234234
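To make the difference concrete, here is a minimal in-memory sketch that runs both approaches on the example line. The helper names naive_pick and csv_pick are hypothetical, introduced only for illustration. naive_pick imitates what cut does (splitting on every comma), while csv_pick uses the csv module so that quoted commas stay inside their field.

```python
import csv
import io

# The example line from the question.
LINE = '"dfjsdlfkj,fsdkfj,werkj",234234,234234,,"dfsd,etwetr"'


def naive_pick(line, indexes):
    """Split on every comma, ignoring quotes (roughly what cut -d "," does)."""
    fields = line.split(",")
    return ",".join(fields[i] for i in indexes)


def csv_pick(line, indexes):
    """Parse with the csv module so commas inside quotes are kept as data."""
    row = next(csv.reader([line]))
    out = io.StringIO()
    # lineterminator="\n" avoids the writer's default "\r\n" ending
    csv.writer(out, lineterminator="\n").writerow([row[i] for i in indexes])
    return out.getvalue().rstrip("\n")


print(naive_pick(LINE, (0, 2)))  # quoted field is cut apart
print(csv_pick(LINE, (0, 2)))    # quoted field survives intact
```

The first call reproduces the broken output from the question, and the second matches the result of the script above. Note that the indexes are 0-based here, whereas cut -f counts fields from 1.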