sed: replace whitespace characters with a single comma, except inside quotes
Problem description
This line is from a car dataset (https://archive.ics.uci.edu/ml/datasets/Auto+MPG) and looks like this:
15.0 8. 429.0 198.0 4341. 10.0 70. 1. "ford galaxie 500"
How would one replace each run of whitespace (the file has both spaces and tabs) with a single comma, but not inside the quotes, preferably using sed, to turn the dataset into a real CSV? Thanks!
Recommended answer
I'd do it with awk:
awk -F'"' 'BEGIN { OFS="\"" } { for(i = 1; i <= NF; i += 2) { gsub(/[ \t]+/, ",", $i); } print }' filename.csv
With " as the field separator, the odd-numbered fields (1, 3, 5, ...) are the parts of the line outside the quotes, which is where whitespace should be replaced. Then:
BEGIN { OFS = FS }                 # output should also be separated by "
{
  for(i = 1; i <= NF; i += 2) {    # in every odd-numbered field (outside the quotes)
    gsub(/[ \t]+/, ",", $i)        # replace each run of spaces/tabs with one comma
  }
  print                            # and print the whole shebang
}
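To try the approach above without touching the real file, here is a minimal sketch that wraps the same awk program in a shell function and feeds it one sample line. The function name ws_to_csv is hypothetical, and the sample line is modeled on the one in the question, with the spacing and the tab made up for illustration. It assumes quotes are balanced and never escaped inside a quoted string.

```shell
# ws_to_csv is a hypothetical helper name; it reads the files given as
# arguments, or standard input when none are given.
ws_to_csv() {
  awk -F'"' 'BEGIN { OFS = FS }
  {
    # Odd-numbered fields lie outside the quotes.
    for (i = 1; i <= NF; i += 2)
      gsub(/[ \t]+/, ",", $i)
    print
  }' "$@"
}

# Sample input with mixed spaces and tabs (spacing invented for this demo).
printf '15.0   8.   429.0\t198.0   4341.   10.0   70.  1.\t"ford galaxie 500"\n' | ws_to_csv
# prints: 15.0,8.,429.0,198.0,4341.,10.0,70.,1.,"ford galaxie 500"
```

Lines that begin or end with whitespace would come out with a leading or trailing comma, so such input may need trimming first.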